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ABSTRACT 


An algorithm for maximum likelihood (ML) estimation is developed with 
an efficient method for approximating the sensitivities. The algorithm was 
developed for airplane parameter estimation problems but is well suited for 
most nonlinear, multivariable, dynamic systems. The ML algorithm relies on 
a new optimization method referred to as a modified Newton-Raphson with 
estimated sensitivities (MNRES). 

MNRES determines sensitivities by using slope information from local 
surface approximations of each output variable in parameter space. The 
fitted surface allows sensitivity information to be updated at each itera- 
tion with a significant reduction in computational effort. MNRES deter- 
mines the sensitivities with less computational effort than using either a 
finite-difference method or integrating the analytically determined sensi- 
tivity equations. MNRES eliminates the need to derive sensitivity equa- 
tions for each new model, thus eliminating algorithm reformulation with 
each new model and providing flexibility to use model equations in any 
format that is convenient. 

A random search technique for determining the confidence limits of ML 
parameter estimates is applied to nonlinear estimation problems for air- 
planes. The confidence intervals obtained by the search are compared with 
Cramer-Rao (CR) bounds at the same confidence level. It is observed that 
the degree of nonlinearity in the estimation problem is an important factor 
in the relationship between CR bounds and the error bounds determined by 
the search technique. The CR bounds were found to be close to the bounds 
determined by the search when the degree of nonlinearity was small. 
Beale’s measure of nonlinearity is developed in this study for airplane 
identification problems; it is used to empirically correct confidence 
levels for the parameter confidence limits. The primary utility of the 
measure, however, was found to be in predicting the degree of agreement 
between Cramer-Rao bounds and search estimates. 
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Chapter I 


INTRODUCTION 

Problems in dynamics may be divided into three categories. By 
considering a general dynamical system, f, with input, U, and output, Y, 
the categories can be defined as: (1) the classical problem where input, 
U, and system, f, are given and the response, Y, is to be determined; (2) 
the controls problem where system, f, and the desired response, Y, are 
given and input, U, is to be determined; and (3) the identification problem 
where input, U, and output, Y, have been measured and the system, f, is to 
be modelled. 

The theory for system identification provides a way for modelling an 
unknown system based upon input and output information. The identification 
theory incorporates a priori knowledge of the dynamic processes and sto- 
chastic processes involved; thus, the identification problem is not usually 
characterized as a black box problem. In fact, system identification prob- 
lems are usually characterized, as done in reference 1, by three factors: 
(1) class of models; (2) class of inputs; and (3) a criterion for state and 
parameter estimation. The models and inputs may be deterministic or 
stochastic and the criterion (cost function) may be based on statistical 
theory or numerical considerations. 

Implementation of the identification theory usually follows four basic 
stages. The first stage requires the design of an experiment. This 
requires the identification objectives to be specified, system configura- 
tion and conditions to be stated, and an input form selected. Determining 
an optimal input for identification can be critical to identification 
success; all the modes of a system must be excited in order to identify the 
system correctly and completely. The second stage is for model structure 
determination (a more comprehensive term is model characterization). The 
model is assumed to be linear or nonlinear, time varying or time invariant, 
with or without process noise, and with or without measurement noise, etc. 
The unknown parameters in the model may include system parameters as well 
as initial conditions, bias terms, measurement and process noise character- 
istics. The third stage involves parameter and state estimation. Para- 
meter and state estimation provide mean values and standard error esti- 
mates; these are obtained by finding an extremum of some optimality 



criterion. State estimation can be better characterized as a filtering 
problem; a Kalman filter is commonly used for this problem. The fourth 
stage is verification. This is accomplished by comparing estimates from 
different data sets and different estimation techniques* In addition, 
other sources provide comparisons; in the case of airplanes, both wind 
tunnel and theoretical predictions are used. Verification is also 
accomplished through sensitivity analysis and through analysis of residuals 
and model predictive capabilities. 

The importance of system identification theory to aircraft technology 
has developed for several reasons. A primary reason is that it provides an 
alternate approach to determining aircraft characteristics (parameters). 
Comparing results with other techniques is always good scientific prac- 
tice. Purely theoretical approaches or purely experimental approaches 
(wind tunnels) have in many instances failed to accurately predict proto- 
type characteristics. Flight testing offers an opportunity to observe 
actual vehicle performance resulting in better calibration and understand- 
ing of wind tunnel results and more accurate modelling for ground-based 
simulators. 

The development of aircraft parameter estimation paralleled the 
developments in estimation and system theory. Early flight test studies 
centered on steady-state maneuvers and free oscillations. These studies 
were time consuming and provided limited information. The main interest 
was to obtain basic aerodynamic parameters, termed stability and control 
derivatives, from linear dynamic models combined with linear aerodynamic 
models. In the early 1950 T s Greenberg and Shinbrot developed a least 
squares approach to analyze simple transient maneuvers (ref. 2,3,4). How- 
ever, without computers the simplest flight test problem with only four 
unknown parameters took 24 hours to analyze (ref. 5). A major development 
for aircraft parameter estimation occurred in the mid 1960 f s. This devel- 
opment was the introduction of large-capacity, high-speed digital computers 
and highly automated data acquisition systems. In 1968, when Larson 
applied the method of quasi- linear ization (ref. 6) and Taylor and Iliff 
(ref. 7) introduced the modified Newton- Raphson method, a new stimulus 


2 



was given to parameter estimation. Other contributions came in the early 
1970’s from Mehra (ref. 8), Mehra and Stephner (ref. 9), and Rault 
(ref. 10). 

During the past decade an increasing concern has been the application 
of estimation theory to nonlinear systems. Much of this has been stimu- 
lated by aerospace applications. Today, incorporating nonlinear dynamics 
with linear aerodynamic models is commonly performed in flight test data 
analysis. The techniques are well established for flight regimes where the 
aircraft aerodynamic model can be expressed as a linear function of states 
and control inputs. However, modelling the combination of nonlinear dyna- 
mics with nonlinear aerodynamics and estimating the parameters associated 
with that model present many difficulties. The need to identify the best 
mathematical representation (model structure) and estimate the associated 
parameters for nonlinear flight regimes has motivated further development 
of identification and estimation techniques. 

A new approach to airplane parameter estimation and confidence 
interval determination is offered in this study as a contribution toward 
building a more general and unified airplane identification methodology. 
The more general methodology starts with the work done in reference 11. In 
reference 11 a useful technique for model structure determination, where 
nonlinear aerodynamic effects are present, is suggested. The suggested 
technique uses a Modified Stepwise Regression (MSR), along with several 
testing criteria to determine a parsimonious, yet adequate, model. The 
limitation of this technique (as with any least squares method) is that the 
estimates are asymptotically biased and variance estimates are based on 
simplifying assumptions which are valid only for the "classical’* linear 
regression. This limitation can be skirted by applying the commonly used 
Maximum Likelihood (ML) technique using the model structure determined by 
the regression and the regression estimates as an initial guess. The ML 
approach has much more favorable asymptotic properties (ref. 12), and it 
provides estimates of the Cramer-Rao (CR) bounds for the parameter 
variance. 



There is a computational cost, however, for the more favorable asymp- 
totic properties of the ML technique. Dynamic systems, such as aircraft 
systems, require substantial computational effort at each step of the 
optimization process. At each step the equations of motion must be inte- 
grated to obtain time histories of each state and output variable. In 
addition, most ML algorithms use a Modified Newton-Raphson (MNR) optimiza- 
tion scheme which requires integrating sensitivity equations. This 
accounts for most of the computational effort since the number of state and 
sensitivity equations to be integrated at each iteration is equal to the 
number of states plus the product of the number of states and the number of 
unknown parameters . Several states and 20 to 30 parameters are not 
uncommon for one flight condition. If a model is desired throughout the 
entire flight envelope, the computational requirements become overwhelming 
since analysis of various flight conditions may require more than one 
candidate model. A very efficient ML estimation algorithm is desirable to 
reduce the computational requirements for processing a large number of 
parameters and candidate models. 

Besides the greater computational cost associated with the ML/MNR 
algorithm an additional difficulty in using the algorithm is that it 
requires the user to have prior knowledge of the model structure to formu- 
late the sensitivity equations and, thus, to formulate the algorithm. This 
can be very burdensome when modelling aircraft in nonlinear flight regimes 
since model structure may change significantly from one flight condition to 
another. Therefore, it is very advantageous to have an algorithm which is 
independent of sensitivity equations. 

Reducing computational requirements of the ML method requires careful 
examination of the optimization methods utilized in the algorithm. 
Although nonlinear, unconstrained optimization problems have been studied 
quite extensively (ref. 13), little has been done to improve the optimiza- 
tion techniques as they apply to aircraft estimation problems. Gupta and 
Mehra considered the numerical aspects of computing ML estimates for linear 
dynamic systems in state-vector form and methods for speeding up conver- 
gence (ref. 14). Trankle, et al . , considered the difficulties associated 
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with the use of a nonlinear dynamic model in ML parameter estimation and 
parameter covariance estimation; sensitivity calculation methods were also 
considered (ref. 15). More recently, Trankle , et al . , considered the 
overall methodology of system identification for nonlinear aerodynamic 
models including computational aspects of the problem (ref. 16). In 
reference 17, a nonlinear least-squares algorithm is developed which uses a 
linear-surface approximation of a scalar-response variable to eliminate 
derivative calculations altogether. The algorithm is applied to test prob- 
lems which do not involve dynamic systems. Presented in the current study 
is a significantly improved maximum likelihood algorithm which relies on an 
optimization scheme referred to as a modified Newton-Raphson method with 
estimated sensitivities (MNRES). A surface approximation is also used in 
MNRES; however, it is treated differently by developing an algorithm which 
retains derivative information in a Newton-Raphson method for multivari- 
able, dynamic systems. This is done to provide directional information for 
the convergence process and to provide covariance information. With the 
MNRES approach, sensitivity equations are eliminated and a significant 
reduction in computational demand is obtained. 

Another difficulty in using the ML technique is that the CR inequality 
provides only a lower bound measure of precision for an unbiased estima- 
tor. It is known from practical application of ML that this lower bound 
can differ from the variance obtained, for example, by repeated measure- 
ments (Klein ref. 18). Attempts have been made, therefore, to either 
modify the CR bounds by considering a band-limited measurement noise 
(Balakrishna , ref. 19, and Maine and Iliff, ref. 20) or to estimate the 
parameter variance directly from measured data (Rault ref. 10). Advances 
in statistical methods also came about with the availability of high-speed 
computers. Beale, in 1960, considered the problem in nonlinear estimation 
of determining the approximate parameter confidence regions using likeli- 
hood ratios (ref. 21). In addition, a measure of nonlinearity was devel- 
oped to assess the quality of the approximation. Surprisingly, Beale T s 
work has had very little application since it was published (ref. 23). In 
1979 , Mereau and Provost (ref. 23) made use of the likelihood ratio 
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approach to determine confidence regions for aircraft systems. In 1980, 
Mereau and Raymond (ref. 24) developed a search procedure to find the 
"iso-distances" defining the confidence regions. 

The goal of the current study is to provide improved techniques for 
estimating parameters and their confidence limits in nonlinear, multivari- 
able dynamic systems, in particular, aircraft systems. The improved tech- 
niques will provide: (1) increased efficiency for the estimation process; 

(2) elimination of the need for a priori knowledge of sensitivity equa- 
tions; (3) more accurate assessment of the parameter error bounds than 
obtained using Cramer-Rao bounds; and (4) an adaptation of Beale’s approach 
to the airplane estimation problem. In addition, a unified methodology for 
solving nonlinear airplane identification problems is inherent in this 
study. 

The development of this work begins in chapter II with a description 
of the airplane model and the regression method used to determine it. 
Chapter III describes the parameter estimation techniques used and their 
statistical properties. The primary estimation method used to determine 
airplane parameters, in this study, is maximum likelihood. Linear regres- 
sion methods are also presented since special forms are developed for use 
in MNRES and because they are used in the model structure determination 
scheme of Chapter II. In Chapter IV the MNRES method is presented with a 
discussion of its various forms and properties. Also presented briefly are 
some commonly used optimization methods which are used to compare with 
MNRES. Chapter V develops the theory for confidence interval estimation 
and the adaptation of Beale’s work to the airplane problem. Finally, 
chapter VI presents the results and discussion for the application of these 
methods to simulated and real data. 
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Chapter II 

MODEL CHARACTERIZATION 

Model characterization, as discussed in the introduction, establishes 
the known or assumed characteristics of the model to be used in the identi- 
fication process. Since system identification usually does not involve a 
black box problem where nothing is known about the model in advance, quali- 
tative statements describing the class of model, optimal inputs and 
statistical properties of the measurements are normally provided. 

Generally, the more information available to characterize the model 
the greater the likelihood of successful identification. Of course, 
attempting a complete representation of a dynamic system, such as an air- 
plane, Is extremely difficult, if not impossible. Actually, a complete 
model is unnecessary. The objective in identification is to select the 
simplest model that allows proper determination of the desired unknown 
parameters from measured data. The principle of parsimony is usually 
applied. This principle states that given a choice of two models having 
equal residual variances choose the model with fewer parameters. There- 
fore, the objective in identification is to choose a parsimonious, yet 
adequate, model. Very complex models may be justified to obtain an accu- 
rate description of the system motion but it is clearly detrimental to the 
estimation process. If the information content in the measured data is 
very limited, or if too many parameters are required, the estimation algo- 
rithm may provide inaccurate estimates or it may fail. 

The models considered in this study represent dynamic systems. Dyna- 
mic systems are characterized by having derivatives with respect to time 
included in the model in addition to the dependent and independent vari- 
ables. One of the possible general forms for these systems is 

x s = f(x s ,u,e,t) x s (0)=x s0 (2-1) 

Y = h(X s ,U,e,t) (2-2) 
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where X s is a vector of state variables, Y is a vector of output vari- 
ables, U is a vector of input variables, and 0 is a vector of unknown para- 
meters. The time variable, t, may or may not appear explicitly. This form 
is not as restrictive as it first appears; many problems can be cast into 
the matrix differential form above. 

Several difficulties arise when estimation techniques are applied to 
dynamic systems. A major difficulty is the significantly greater computa- 
tional demand associated with solving matrix differential equations. In an 
estimation algorithm these equations are solved repetitively. A difficulty 
can also arise when integrating the equations of motion because the bound- 
ary conditions or initial conditions are not always known exactly. There- 
fore, in many estimation problems the initial conditions are treated as 
unknown parameters. Another difficulty is that the solution to the differ- 
ential equations can be very different depending on sometimes very small 
changes in the unknown parameters. For example, a first degree system is 
stable or unstable depending only on the sign of the damping term. Non- 
linear systems can amplify this type of problem. The success of the esti- 
mation can depend on the initial guesses for the parameters since failure 
may occur when a parameter is outside a stability boundary. Unfortunately, 
obtaining stability boundaries is really only practical for linear, time- 
invariant systems. Finally, numerical difficulties with truncation and 

rounding errors are always present where numerical differentiation and 
integration are performed. 

A. AIRPLANE EQUATIONS OF MOTION 

The particular dynamic system of interest to this study is the 
airplane, modelled by equations in the general form 

X s = f(X s ,U,6) X s (0) = X s0 (2-3) 

Y = g(X s ,U,0) (2-4) 
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The equations of motion used are referred to a body axes system (See 
fig, 1), The equations were developed with the following assumptions: 

1. The airplane is a rigid body. 

2. The effect of spinning rotors is negligible. 

3. The airplane has a plane of symmetry in xz plane. 

4. There are no external disturbances to the airplane. 

5. Thrust is accounted for as part of an< ^ ^x w ^ ere 

C X = C T C ° S a T + °L Sin a “ 

C z = C T sin a T - cos a - Cp sin a 

The resulting nine equations represent a six degree-of-f reedom , 
coupled, nonlinear system where the kinematic relations are expressed in 
terms of direction cosines. They are given as follows: 

EQUATIONS OF MOTION 

u = -qw + rw + g* xz + ^ C x (2-5) 


cos a 


qS w 

v = -ru + pw + gl + — — C 
r yz m Y 


( 2 - 6 ) 
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( 2 - 8 ) 

(2-9) 
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The nondimensional aerodynamic forces and moments, C x , Cy , C z , C£ , 
C m , C n (shown in fig. 1), are usually approximated by a Taylor series 
expansion around steady trimmed flight conditions or by polynomial splines 
(see ref. 25). The form of the aerodynamic model equations is 

y(t) - e 0 + 8, Xj + e 2 * 2 + ... + 0 (2-16) 


or in vector form 


y = X e (2-17) 

where y(t) or represent one of the nondimensional aerodynamic forces 

or moments at time t or at the ith data point. The stability and control 
derivatives are represented by to 0 n ^_^ and corresponding to an 

initial trim flight condition the trim forces or moments are represented by 
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0q, The to x n represent any function of the state and 

control variables chosen for the model. The row vector X-^ is given as 

X. = [l x x 0 ... x ] (2-18) 

1 L 12 n , J 

p-1 

In general, the form of the aerodynamic equation is unknown; however, for 
estimation the form must be postulated. The form may vary significantly 
from one flight condition to another. 

The output equations for this study are as follows: 


V = 



2 

v 


+ 


2 

w 


8 = sin *(v/V) 

a = tan ^(w/u) 

G = sin 1 (-£ ) 

xz 


4> = tan 



l r • , 

a = — u+qw-rv 

x g L 



1 f 

a = — v+ru-pw 

y g L 


g *■ ] 

yz J 


1 r • 

a = — w + p v 
z g 1 


q u - g * zz ] 


(2-19) 

( 2 - 20 ) 

( 2 - 21 ) 

( 2 - 22 ) 

(2-23) 

(2-24) 

(2-25) 

(2-26) 
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The airplane identification problem can be made more tractable by 
treating longitudinal and lateral cases separately. This is accomplished 
by providing the required lateral information to the longitudinal case (or 
the required longitudinal information to the lateral case) in the form of 
measured input variables. This has been used successfully in many other 
studies, for example, reference 18. Thus, the states, outputs, and inputs 
for the two cases are given as follows: 
for the longitudinal case, 


X 

s 


[u w q 


Z Z 
xz yz 



Y = [V a q 0 a x 



U = 


6 0_ v_ 

e E E 


P E r E 



and for the lateral case, 


X 


[v p r Z 
1 r xz 


Z 

yz 



Y = 



U 



U E W E ®E 


q E a E f 


(2-27) 

(2-28) 

(2-29) 

(2-30) 

(2-31) 

(2-32) 


where the subscript E indicates a measured quantity. 

B. MODEL STRUCTURE DETERMINATION 

The goal of model structure determination is to determine an analyti- 
cal representation of the system which can be classified as an adequate 
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model. An adequate model is one which sufficiently fits the data, allows 
successful estimation of the parameters, and has good prediction capabili- 
ties. In aeronautical applications the form of the rigid body equations of 
motion is known. The only uncertainty, with regard to model structure, is 
in the aerodynamic model equations (eq. (2-16)). One of the successful 
methods for determining the model structure of these equations from 
measured data is based on stepwise regression. 

In the stepwise regression approach, after postulating the aerodynamic 
model equation, the determination of significant terms among the candidate 
variables and estimation of corresponding parameters follows. The variable 
chosen for entry into the regression equation is the one that has the 
largest correlation with y after adjusting for the effect on y of the vari- 
ables already selected. The parameters are estimated by the least squares 
technique. At every step of the regression, the variables incorporated 
into the model in previous stages and a new variable entering the model are 
reexamined. Any variable which provides a nonsignificant contribution (due 
to correlation with more recently added terms) is removed from the model. 
The process of selecting and checking variables continues until no more 
variables are admitted to the equation and no more are rejected. Experi- 
ence shows, however, that the model based only on the significance of indi- 
vidual parameters in model equation (2-16) can still include too many terms 
and, therefore, may have poor prediction capabilities. Several criteria 
for the selection of an adequate model are introduced in reference 11 and 
the details of the whole procedure are explained in references 11 and 26. 
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Chapter III 


PARAMETER ESTIMATION 

In this study Maximum Likelihood (ML) and Linear Regression (LR) 
techniques are used to estimate parameters, ML is used to estimate both 
airplane parameters and their standard errors (Cramer— Rao lower bounds) 
from flight data. The ML algorithm is used with various optimization 
schemes which are described in chapter IV. LR is used for three different 
applications in this study. The three applications are: (1) estimating 

aerodynamic model structure; (2) estimating airplane parameters (starting 
values for ML); and (3) estimating sensitivities in MNRES. The first and 
second applications of LR were accomplished using stepwise regression as 
described in the last chapter. The third application was accomplished 
using an algorithm developed in this study. 

A# LINEAR REGRESSION 

Linear regression analysis is a part of statistical theory which 
generally deals with the determination of relationships between response 
and predictor variables. One application of LR theory is curve fitting or 
surface fitting. In this application, the predictor variables (independent 
variables) are assumed to be deterministic and known without error; 
response variables (dependent variables) may have error. A numerical 
method commonly used in curve fitting to compute empirical coefficients is 
the method of Least Squares (LS). In this method, the same model form as 
equation (2-16) can be used to fit the curve or surface. The solution for 
the unknown parameters or coefficients are found by minimizing the sum of 
squares of the error between known data points and computed data points 
determined by the model. The LS method is valid only for linear problems; 
that is, problems where the unknown parameters occur linearly in the model 
regardless of whether the model structure itself is linear or nonlinear. 
LS can be solved in a batch mode or recursive mode and both modes have 
application for determining the sensitivities in MNRES. 
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A-l BATCH PROCESSING 


Batch processing of data in the LS method is probably the most 
commonly used approach for curve fitting problems. The model form given by 
equation (2-17) can be written as 

y. = X. S + e. i = 1, 2, ..., N (3-1) 

l l » > > 


where y^ is the ith value of one response variable; is the ith 

vector of predictor variables; S is the vector of unknown coefficients; and 
e-£ is the equation error at the ith data point. This error may contain 
measurement noise, process noise, and/or modelling error. However, no 
assumptions are made about the statistical properties of e. In application 
to MNRES, y± represents one element of the output vector, Y(0); X^ 
represents the ith set of values for the vector of unknown parameters, 0; 
and S is the np vector of coefficients to be computed. If a first degree 
np-polynomial expansion is chosen for X^, then each element of S will 
be the desired sensitivities (slopes). 

Applying the least squares criterion which requires minimization of 
the mean square error gives the cost function as 


J(S) 


N 

1 

i=l 


N 

- I 

i=l 


r*i 


- X, s]' 


and minimization requires 


3J(S) 

3S 


0 



Solving for S gives 


S 



(3-2) 


(3-3) 


(3-4) 


15 



Ar2 RECURSIVE PROCESSING 


Recursive processing provides a significant reduction in memory 
requirements for the MNRES algorithm. However, a specialized form of 
recursive least squares is needed for surface fitting in MNRES. Normally 
in a recursive least squares problem the purpose is to update parameter 
estimates based on N data points with some new information so that the 
updated estimates are based on N+l data points. In the following deriva- 
tion a least squares recursive algorithm is designed specifically for the 
MNRES algorithm. MNRES requires the parameters be updated using both 
incoming new information and outgoing old information so that the estimates 
are always based on a constant number of data points. 

As in the batch mode the surface fitting is performed to obtain slope 
or derivative information. Consider the least squares problem formulated 
as 

Y = XS + e (3-5) 


where Y is a vector of n data points on a surface to be fit by the model 
given as XS. S is a vector of n p unknown coefficients (slopes) and X is 
an n by n p matrix defined as 


X 



X 11 X 12 X lj 

X 21 X 22 X 2j 


(3-6) 


Ll 


X il X i2 


x. . J 


i=n, j=n 


Least squares estimation gives a solution as 

S = [x T x ] -1 X T Y n > n p 


(3-7) 
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Now defining a recursive relation for the k+1 iteration 


P 


k+1 



(3-8) 


and the updating equation as 


k+1 


= [P 


-1 




oT 




-1 


(3-9) 


where the row vector is the new set of 0 to be included in X and a° 
is the outgoing set of 0 to be removed from X which produced the highest 
value of the cost function. The recursive relation for S is 


s k+i = \ - p 


r oT o 

k+ 1 L a k 


T , T oT 

y k - a k y k + (a k a k ■ a k 




(3-10) 


where y° and y^ are one of the scalar elements of the outgoing and incom- 
ing vector Y, respectively, at the kth iteration. 

The derivation for equations (3-9) and (3-10) is as follows. Define Z 
as the common elements of X between two iterations. Partitioning X for the 
k-1 and kth iterations results in 


k-1 


(3-11) 


x k = 


a k 

z. 


(3-12) 


By using equations (3-11) and (3-12), the following relations can be 
written: 


x k-i x k-i \ \ z k z k 


(3-13) 
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(3-14) 


T T T 

X. X. = a, a. + Z. Z. 
k k k k k k 


From equation (3-13) 


T T oT o 

Z, Z. = X, , X. - a. a. 
k k k-1 k-1 k k 


(3 


Substituting equation (3-15) into (3-14) gives 


T T oT o T 

X. X. = X. , X, - a, a, + a, a, 
k k k-1 k-1 k k k k 


(3 


Substituting equation (3-16) into (3-8) gives 


P, = [xj . X. | — a° T a° + a£ a, ] 1 

k+1 L k-1 k-1 k k k k J 


(3 


which can also be written as 


k+1 


r — 1 oT o T 1 

“ t P k " a k a k + 3 k 3 kJ 


-1 


(3 


Applying the same development to equation (3-7) gives 


r T oT o , T I 

s k+i = p k+i t x k-i y k -i ' a k y k + a k y J 


(3 


and substituting equations (3-7) and (3-8) delayed a step into (3-19): 


r —i or o ii 

’k+1 = P k+1 f P k S k a k y k + a k y J 


(3 


Expanding equation (3-20) gives 


Vi ■ p k+ i p ‘‘ s k ‘ p k + i a f \ * Vi \ p k 


(3 


-15) 


-16) 


-17) 


-18) 


-19) 


- 20 ) 


- 21 ) 


18 



Noting that 


s , - p , p ii. s. = o 

k k+1 k+1 k 


(3-22) 


and then adding equation (3-21) to (3-22) gives 


s k+i s k p k+i p k+i s k + p k+i p k s k p k+i \ y k + p k+i a k y k (3 23 ) 


s,.., = s, - P k+1 [p;^ S k + p^ 1 S k + a° T y° - a* yj 


k+1 


(3-24) 


Combining terms in (3-24) gives 


S k+1 S k “ P k+1 L v * k+1 


[( p vi, - O - 


oT o 


k ; "k “k y k a k y k^ 


(3-25) 


and using equation (3—18) yields the desired relation 


S k+1 = S ’ - P 


[(aT a, - a, 


oT o ■» oT o 

k "k+1 LV “k “k “k a k ' b k a k y k 


T i 

a k y kJ 


(3-26) 


A-3 STATISTICAL PROPERTIES OF LS ESTIMATES 

Although the least squares technique, a numerical procedure, is not 
based on any statistical formulation, the least squares estimator is often 
characterized in statistical terms since the estimates can be treated as 
random variables. 

In the general least squares problem both process noise and measure- 
ment noise occur in the data. The model has the form 

Y = X 0 + w (3-27) 

t t p 

where Y t is an N by 1 vector of the response variable for the N data 
measurement points. Subscript t indicates this is the true value of the 

variable without noise. X t is an N by n p matrix of the state and input 
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variables and 0 is the np by 1 vector of unknown parameters. w p is the 

N by 1 vector of process noise. The measurements provide 

Y = Y. + v (3-28) 

t y 

X = X. + v (3-29) 

t x 

where v„ and v„ are the measurement noise in Y and X. A typical 

y A 

assumption made about the process and measurement noise is that they are 
stationary, zero mean and independent random processes. The solution to 
this least squares problem (from the last section) is 

0 = [X T x] _1 X T Y (3-30) 

Premultiplying equation (3-27) by [x T x] -1 X T and substituting equa- 
tions (3-28), (3-29), and (3-30) it is found that 

0 = 0 + [X T x] -1 X T [w + v - v 0] (3-31) 

L J L P y X J 

Therefore, the expected value of the estimate error has the form 

E { 0 - 0} = -E{ [x T x] _1 X T v x } 0 (3-32) 


and the covariance matrix is 

cov( 0 - 0) = E{(x T x) _ 1 X T e e T x(x T x)" 1 } 

+ E{(X T X) _1 X T v x 0 0 T v x X(X T X) _1 } (3-33) 

where 

e = w + v . 

P y 
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From these equations it can be seen that the least squares estimator is 
biased even if the measurement noise v x , v^, and process noise, Wp , 
are zero mean and independent. Only with the additional assumption that X 
is known without error (i.e., v x = 0), as might be the case in a curve 
fitting problem, will the estimates be unbiased. Note the covariance 
matrix is affected by all the measurement and process noise. 

B. MAXIMUM LIKELIHOOD (ML) 

The general parameter estimation problem for an airplane involves 

solving the nonlinear estimation problem in the presence of both process 
and measurement noise while modelling the airplane with the coupled, non- 
linear equations of motion. One of the advanced techniques commonly used 
for this problem because of its superior statistical properties is maximum 
likelihood. 

B-l ALGORITHM DEVELOPMENT 

Assume the outcome of an experiment is N observations of the (n Q xl) 
output vector Z^, i=l,2,...,N, which depends on the unknown parameter 

vector 0. In general, the unknown parameters are the aerodynamic para- 
meters, initial conditions, and measurement and process noise statistics. 

Let f(Z 1 ,Z 2 ,...Z N /0) be the conditional probability density function 
for the measurements given 0. The maximum likelihood estimate is the esti- 
mate for which the outcome of the experiment Z^, i=l,2,...,N is most 

likely to occur; that is, the probability density is maximized. The 
problem is stated as 


0 = max f (Z ,Z , . . . ,Z /0) 
1 z N 


(3-34) 


Using the property of joint probability density functions that 


f(Z 1 ,Z 2 ,Z 3 /0) = 


f(z 3 /z 2 ,z L ,e) 


f(z 2 /z 1 ,e) 


f^/e) 


(3-35) 


the density function can be written as 
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f(Z.,i=l,N/0) = f (V Z N-i ,0)f( Vi /Z N-2’ O) ** f(Z 2 /z i’ e) f^/0) (3-36) 


N 

= n f(z /z:_ 0) 
i=l 


(3-37) 


where 


Z' 

i-l 


= Z i-1’ 


Z i-2 ’ 


(3-38) 


If the Zj_ measurements are treated as fixed, the density function becomes 
a function of 0 only. This function is usually referred to as the likeli- 
hood function; that is, 

N 

L (6) = TT f(Z /Z' ,9) (3-39) 

i j i XXI 


Consequently, the problem of finding a maximum likelihood estimate becomes 
the problem of finding the 0 which maximizes the likelihood function. 

To define the likelihood function, the distribution of the measure- 
ments given 0 must be defined. If the distribution of the measurement and 
process noise is Gaussian, then the distribution of the measurements given 
0 will also be Gaussian and can be uniquely determined by computing the 
mean and covariance. The mean is given as 

E tV z i-i- 9 J = Vi (3_40) 

where is the best estimate of the measurement at time i given 
measurements up to and including the previous point. By definition, the 
covariance is given as 
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c °''(V z i-i- 6 1 ■ E II Z 1 - WJK - z i/i-il T l 


■ e ! v i v I) - 


(3-41) 


where are the residuals. Now the problem is to compute the condi- 
tional mean, and covariance, For systems including 

process noise, these values can be obtained by use of a Kalman filter. It 
has been shown (ref. 8) that for high sampling rates (as is commonly used 
to collect flight test data), the residuals tend toward a Gaussian 

distribution. Therefore, the distributions for both and for 

<Zi/Zi_i # e) are reasonably assumed to be Gaussian. In systems without 
process noise, some simplifications are possible. In particular, the resi- 
dual error may be written as 


\> ± = Z ± - Y ± (3-42) 

where is the predicted value of the output vector at time i. Also, 

the mean and covariance of the residuals may be assumed constant in time 
and written as 


E{v } = 0 (3-43) 

e{v^v. T } = R6^; (3-44) 

where 6 ± j is the Kronecker delta. With these assumptions, the condi- 
tional probability density function can be written as 

f(Z i /Z i-l’ 6) = (2lT) 0/2 | R | _1/2 ex Pt“ 1/2 v i R_1 V iJ (3-45) 
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Hence the likelihood function using equation (3-45) is 


N 

LjCe) = n f(Zi/z^ =1 ,0) 


-Nn 

(20 


o/2 


N 



- 1/2 


exp{ -1/2 1 vV *v } 

i=l 


(3-46) 


The negative log of the density function is more convenient to use and 
since the function is monotonic, there is no change to the problem except 
that maximizing the density function equates to minimizing the negative log 
of the density function. Thus, the more commonly used negative log likeli- 
hood function becomes 


N 

L(0) = j l (Z ± - Y i ) T R _1 (Z t - Y i ) + | £n|R| + constant 


(3-47) 


The unknown R can be estimated by minimization of the likelihood function 
with respect to R. This produces 


R 


N 


N 

l 


i=l 



where 


v 


i 


z i- 


Y.(6) 


(3-48) 


( 3-49 ) 


After substituting R into equation (3-47), the final form of the cost 
function, as used in this study, is obtained. This cost function is given 
as 


1 T 1 

J( 0) = j I (Z. - Y ± ) R (Z. - Y + constant 


(3-50) 


i=l 
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The cost function given by (3-50) is the same as that used in an 
output error technique except the measurement noise covariance matrix is 
used as a weighting matrix. The problem is now in the form of an uncon- 
strained optimization problem where the cost function given in equation 
(3-50) must be minimized with respect to the unknown parameter vector, 0. 
The unknown parameters determined by this method, for this study, are the 
airplane stability and control derivatives and trim coefficients. In 
addition, measurement noise statistics (weighting matrix) and parameter 
standard errors are determined. 

The standard errors determined are the Cramer-Rao lower bounds provid- 
ing a measure of the maximum achievable accuracy for the parameters. These 
are defined by the Cramer-Rao inequality 


f - - t 1 f a l( 0) ,-i 

e{ ( e - e) ( e - 0) } > -e{ — } 

3030 1 


(3-51) 


where 


-E{ t - M (3-52) 

aeae 1 

and M is usually referred to as the Fisher Information matrix. It Is 
assumed In this study that the approximated Hessian matrix, H, from the 
optimization procedure is a good approximation of the Fisher Information 
matrix. The solution using a gradient optimization scheme generally has 
the following form for the kth iteration 

\ + i -K-K 1 % (3 - 53) 
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where 


9J(0) 

90 


and 


g 


k 



H “ M 


B-2 STATISTICAL PROPERTIES OF ML ESTIMATES 

The maximum likelihood method is popular, especially in flight test 
data analysis, because of the excellent large sample properties of its 
estimates. Although ML estimates do not possess optimal properties for 
small samples, sampling experiments (ref. 27) have shown that the ML method 
produces acceptable estimates in many situations. The ML estimate is 
robust and sufficient when a sufficient statistic exists. The large sample 
properties of an ML estimate are summarized here (derivation of these 
properties is given by Cramer, ref. 28): 

1. Asymptotically unbiased: lim e{ 0 } = 0 

N+°° 


2. Asymptotically efficient: 


E{ ( 6-9) ( 0 — 0 ) T } > -e{ 


3 2 L(0) 
3 090 T 


-1 

} 


3. Consistent: 


lim Pr{ (0-0) <_ £ } 
N+°° 


with e arbitrarily small 


4. Asymptotically normal: 0 = n(0, M 

Asymptotic unbiasedness and consistency are very similar. However, 
consistency implies that if an estimator is consistent for 0, it is also 
consistent for any well behaved function of 0. Thus, consistency is more 
significant than unbiasedness. Asymptotic efficiency is a statement of the 
Cramer-Rao inequality; therefore, for large samples the Cramer-Rao lower 
bounds are obtained. 
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Chapter IV 


OPTIMIZATION TECHNIQUES 

The ML parameter estimates are obtained by solving an unconstrained, 
nonlinear optimization problem; that is, find 9* which minimizes the cost 
function J(0). The necessary and sufficient conditions for this problem 
are as follows: 

1. J ( 0 ) is differentiable at 0*. 

2. Vj( 0*) = 0. 

3. V 2 j(6*) > 0. 

The theory for solving unconstrained, nonlinear optimization problems is 
often based on the assumption that the cost function J(0) is a quadratic 
function of 0. This approximation provides a more tractable theory and 
allows basic theorems and properties of the optimization methods to be 
readily established. Corresponding theorems for solving general nonlinear 
functions of 0 are very difficult to prove. However, techniques developed 
using the quadratic assumption are still very effective for nonlinear func- 
tions. Many techniques for solving nonlinear minimization problems are 
developed from practical experience. 

Optimization techniques for unconstrained problems can be divided into 
two categories: derivative methods and search methods. Derivative methods 
may be further classified by the order of the derivatives used; search 
techniques can be divided into direct search and random search. Some tech- 
niques combine search and derivative methods; however, these hybrid methods 
are not considered in this study. 

The choice between optimization categories depends on the particular 
problem. Search methods determine the optimization path solely from cost 
function evaluations and, therefore, do not require as much algorithm 
preparation as needed when using sensitivity equations. Search methods 
also do not need the regularity and continuity conditions that derivative 
methods need for the cost functions. In many unconstrained, nonlinear 
programming problems, however, derivative methods will converge faster 
(ref. 13), particularly for estimation problems involving dynamic systems. 
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This was demonstrated for airplane systems both in this study and in 
reference 29. 

Various derivative techniques are available for a variety of nonlinear 
programming problems; however, no one technique is best for all problems. 
For example, the steepest descent method works better away from the minimum 
whereas Newton* s method works better near a minimum. A compromise between 
these two techniques is the modified Newton-Raphson method (MNR) . MNR 
belongs to a class of methods known as quasi-Newton or large step gradient 
methods; these methods approximate the Hessian matrix or its inverse while 
only using first derivative information. 

The derivative information can be computed in a variety of ways. For 
dynamic systems, integrating analytically-derived expressions (sensitivity 
equations) for the derivatives is probably the most accurate as well as the 
most time-consuming. One alternative is a numerical approximation scheme. 
Finite difference (f.d.) methods are often used since they eliminate the 
additional burden of deriving and incorporating sensitivity equations into 
the algorithm. However, the f.d. methods require about the same level of 
computational effort as integrating the sensitivity equations. Another 
option is the proposed surface fitting method of the MNRES algorithm 
presented in section IV-B. 


A. COMMONLY USED METHODS 

Two optimization schemes, representing the two main categories of 
methods, are selected in this study primarily to provide a benchmark com- 
parison with MNRES. They are commonly used in aircraft estimation and con- 
trol problems and, therefore, are a good indicator of the relative merit of 
MNRES. The two optimization methods are the flexible polyhedron search 
(FPS) and the modified Newton-Raphson (MNR) method. More details are pro- 
vided in references 13 and 30 for the FPS and in references 12 and 31 for 
the MNR. A variation of the MNR will be used in which the derivative 
information is computed by using finite differences (refs. 13 and 16). 
Both the f.d. form and the sensitivity equation form of MNR are used in 
this study. 
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A-l FLEXIBLE POLYHEDRON SEARCH 

Since FPS has been found to be advantageous in some aircraft design 
and control applications (ref. 32), it may be a good candidate for reducing 
computational demands in aircraft estimation problems* FPS avoids deriva- 
tive calculations, where the quasi-Newton methods spend most of the compu- 
tational time. The algorithm is independent of model form and, thus, is 
readily applicable to any aerodynamic model. 

Consider the unconstrained optimization problem of minimizing a scalar 
function of n p variables J(0). The FPS method uses a flexible polyhedron 
surface with n p + 1 vertices where each vertex is defined by a vector 
0. The vertex 0 H , producing the highest value of J(0), is projected 
through the centroid of the remaining vertices to define a new vertex. 
This new vertex, and the remaining ones without Ojj, form a new polyhe- 
dron. This operation is called a "reflection." Figure 2 shows two steps 
in this process for the case with two unknown parameters. If the new 

vertex produces a lower cost than 0^ (the vertex producing the smallest 

J(9)), then an expansion takes place and a new vertex is located farther 
out along the same projection. Similarly, if higher costs are found, a 
contraction takes place. The minimum of the cost function is found by 
repeatedly deleting the point having the highest value of J(0) and adding 
new projected points that produce lower J(0). The flexible polyhedron is 
able to adapt to the shape of J(0) by stretching down slopes, contracting 

near minima, and changing direction in curved valleys. 

A-2 MODIFIED NEWTON-RAPHSON METHOD 

This report is primarily concerned with nonlinear aircraft estimation 
problems. Since the MNR approach is commonly used for these problems, it 
is included as a benchmark algorithm. Although it is computationally 
burdensome to estimate derivatives, this information enables relatively 
fast convergence of the optimization process. In fact, Newton’s method 
converges in one pass for cost functions which are quadratic. Hence, 
Newton and quasi-Newton techniques used for estimation problems of dynamic 
systems are expected to converge faster when the quadratic approximations 
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for the cost functions are valid. Also, these methods provide both step 
size and direction for each iteration. In some problems, however, addi- 
tional control of step size is needed to ensure convergence. Since remov- 
ing the requirement to solve sensitivity equations is desirable, the MNR 
algorithm in this report will use a simple finite-difference method except 
when otherwise noted. This is not too costly in terms of computational 
time (refs. 15 and 16); however, care must be taken to obtain the deriva- 
tives as accurately as possible. 

The MNR and the MNRES algorithms are the derivative methods of 
interest to this study. As discussed in an earlier section the problem is 
to minimize the weighted square of the errors between the computed model 
outputs and the actual measured outputs. It is assumed that only the mea- 
sured outputs are corrupted by noise and that the noise is zero mean and 
uncorrelated. This leads to a nonlinear estimation of unknown parameters. 
Consider the system equations (repeating (2-3) and (2-4)) and the measure- 
ment equations, 


with 


x s - f(x g ,u,e) 


X S (0) - X s0 


Y = h(X s ,U,9) 


z i - Y i + v i 


i = 1,2,...,N 


E( v , ) = 0 and E(v,v!) « Rfi,. 

i i y ij 


(2-3) 

(2-4) 

(4-1) 

(4-2) 


where X s , U, and Y are the state, input and output vectors, respec- 
tively. 0 is the unknown parameter vector. and v^ are the measure- 

ment vector and measurement noise vector, respectively, at t = tp R is 
a diagonal measurement noise covariance matrix which is, under the above 
assumptions, equal to the covariance matrix of the residuals. Without 

process noise the ML cost function to be minimized is given by equation 

(3-50), where the added constant and multiplicative factor of 1/2 are 

dropped without affecting the solution. 
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(3-50) 


J(0) 



The matrix R is given by equation (3-48) 


where 


R 


1 

N 


N 


l 

i=l 


v v T 
i i 


V i = Z i " Y i (6 0 ) 


(3-48) 


(3-49) 


and 0q is the initial estimate of the unknown parameter vector. The MNR 
method accomplishes the minimization by expanding Y, the computed output 
vector, about 0q, the initial unknown parameter vector. A Taylor Series 
expansion of Y truncated to first order is 


Y(6) = Y(0 O ) +3Q 



(4-3) 


where A0 = 0 - 0q. Then by substituting into J, a quadratic approxima- 
tion of J is obtained. The increment A0 is the unknown. Differentiating J 
with respect to 9 and equating the derivative to zero to find a minimum 
results in 

|4 - - l gJ R -1 v + l G. R -1 G. A0 = 0 (4-4) 

30 i-1 1 1 i=l 1 1 


where 



3y. 


90 




(4-5) 
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and 9j are the kth and jth elements of the Y and 9 vectors, respec- 
tively. Solving for A9, 


A0 = 



(4-6) 


This is often written as 

(4-7) 

emphasizing the Fisher Information matrix, M, and gradient terms. For the 

A A A A 

kth iteration the estimate 0^+^ is given as = 9^ + A ®k+1* 

this study convergence is achieved when and AO^/0^ are less 

than .001. The sensitivities, G^, are determined separately from the 
above steps. This may be done by integrating the sensitivity equations or 
using a finite difference approximation or by using MNRES. 

B. MNRES METHOD 

The MNRES method developed in this paper is essentially an MNR 
optimization algorithm with an efficient method for estimating sensitivi- 
ties. As in the ML/MNR algorithm previously described, the same equations 
(eqs. (4-1) through (4-7) and (3-48) through (3-50)) apply for ML/ MNRES ; 
however, the sensitivities, G^, are computed by using slope information 
from local surface approximations of Y(9). The approximations are made 
near the series expansion point of equation (4-3). The sensitivities 
obtained from the fitted surface are determined with less computational 
effort than that obtained by either a finite-difference method or integrat- 
ing analytically-determined sensitivity equations. 

The MNRES algorithm is readily optimized for a particular application 
in that the user can select both the type of surface and the method of 
fitting the surface. Two types of surfaces which are very practical in 


4 " e ■ -x' 1 It 
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aeronautical applications are nth-order polynomials and splines. Two effi- 
cient methods of fitting the surface are by solving n p simultaneous equa- 
tions for np unknowns (algebraic solution) and by solving a redundant set 
of equations for n p unknowns (least squares solution). The tradeoffs in 
choosing a surface and a surface fitting method involve the choice between 
accuracy of the sensitivities and computational effort. 

B-l ALGEBRAIC SOLUTION 

The MNRES algorithm is best described by considering the computation- 
ally least demanding approach of using a linear-surface approximation. 
Expanding Y(0) in a first-degree polynomial in 0 for each point in time 
and at rip+1 different points in the parameter space gives 

(6^) = s, a + s, 0^ + ... + s, 0^ (4-8) 

7 ki kO kl 1 kp p 

where i indicates the ith point in time; k indicates the kth element of the 
output vector Y(0); and j indicates one of the n p +l sample points used 
to fit equation (4-8) to Y(6). Note that 

y k (ej) = y k (0) (4-9) 

at each of the n p +l points. The sample points are chosen by allowing a 
small perturbation of each parameter around the point where the sensitivi- 
ties are desired. Alternatively, the perturbation size can be selected to 
reflect the relative significance of each parameter to the model. This 
allows for larger perturbations of the less sensitive parameters and 
smaller perturbations for the very sensitive parameters, thus providing 
higher quality derivative calculations. This alternative is discussed 
further at the end of this section. The slopes s^ to s^ n are the 

desired sensitivities, ( 3 yic/ 3 9 j ) i • s^O is the value of y^CQ) 

evaluated at the series expansion point of equation (4-3). Note that 
because this is a linear surface, the slopes are constant over the surface 
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and need not be evaluated specifically at s^q. If a higher degree poly- 
nomial is fit to y^(9), the slopes will vary across the fitted surface 
and, therefore, must be evaluated specifically at s^q • Consider the 
matrix representation of (4-8) for the first element of Y and for the 
np+1 sample points at time "i": 


Note that y^ is the kth element of the output vector, Y, and yJ^ is 
the jth element of the surface fitting vector, Y^. Matrix X contains 
np+1 rows defining the np+1 sample points. Expanding equation (4-10) 
to show the vector and matrix elements gives: 



Since s^q is a known point, equation (4-11) can be simplified. The first 
line in equation (4-11) can be eliminated by subtracting it from the other 
np equations. Thus, 

AY U = AX S u (4-12) 
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where 

A0^ =0^-0° 

k k k 

Thus, at time "i" the sensitivities for the first element in Y are given by 

S li = (4-14) 


Note that the AX matrix is independent of time. This enables the sensitiv- 
ities to be calculated rapidly during each iteration of the algorithm. 
This is a key factor in reducing the computational effort of the algorithm; 
in effect, the integration of the n s n p sensitivity equations has been 
replaced by a set of n 0 matrix multiplications. 

Figure 3 shows, geometrically, two iterations for the case where 0 is 
of dimension two and a linear surface is used to fit a scalar y. The 
expansion at time "i" is 





A6 

A6 



(4-15) 
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During the first iteration, this expansion requires that y(0) be evaluated 
at n p +l=3 points: y® , y^ , y^. Computationally, the first itera- 
tion is the most costly phase of the MNRES algorithm. Each evaluation of Y 
requires that the equations of motion be integrated. The linear surface 
(indicated by the solid-line triangle in fig. 3) is fit and the slopes 
(sensitivities) are thereby determined. The algorithm proceeds, as in the 
ordinary MNR method, using equation (4-7) to obtain 


k+1 


" °k + 


A0 


k+1 


(4-16) 


The estimated sensitivity values, s^ j , are used to define the elements of 
matrix G in equation (4-6). The new Y is evaluated (by integration of 

equations of motion) at to get y^(0). At this point the MNRES 

algorithm has reduced the sensitivity problem to solving a set of simulta- 
neous equations. This is done by eliminating the 0-3 in X which produced 
the greatest contribution to the cost in J(0) and replacing that informa- 
tion with the newest estimate of 0. The new surface (indicated by the 
dotted triangle) in figure 3 assumes y^ was the high cost point and so 
eliminated it from the fitted surface. The slopes of the new surface pro- 
vide the sensitivities for the MNRES algorithm to proceed. In this scheme, 
a check should be made to ensure the new Y3(0^ + ^) produces a smaller 
value of J(0). In some cases step-size control or complete restarting may 
be needed. Note that initialization of the algorithm requires that n p +l 
integrations be performed for the n p +l trajectories, yJ . After this 
initializing pass only one integration of the system equations is needed to 
evaluate the cost, J(0); outputs, Y(0); and to update parameter estimates 
for each iteration. 

As mentioned previously, in practice it is beneficial to choose the 
perturbation size in a different fashion from that used in a simple finite- 
difference method. Simply using a 1 percent perturbation on each element 
of 0 to obtain the corresponding perturbation in each element of Y(0) is 
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not optimum for derivative calculations. Experience has shown that it is 
beneficial to use perturbation sizes which reflect the importance of the 
parameter to the model. By computing the sensitivities of 6 j ^ M j j for 

each parameter and then letting the perturbation sizes be scaled inversely 
proportional to the normalized ratios of sensitivities, more accurate deri- 
vative information can be obtained. Of course, this applies only when an 
initialization or "restart” is needed. The fundamental issue is that the 
less sensitive a parameter, the larger the perturbation necessary to obtain 
an appropriate size response in the outputs. This approach could also be 
applied to an MNR method. Theoretically, the same derivative should be 
obtained for any sufficiently small perturbation in 0; however, because of 
both the sometimes widely varying sensitivities of the parameters and the 
numerical-precision limitations, it is beneficial to vary the perturbation 
size according to the aforementioned rule. The sensitivity defined as 
was introduced in reference 33 and used again in reference 34 
as a means of quantifying the significance of a parameter to the model. 

P-2 LEAST SQUARES SOLUTION 

The least-squares approach to fitting the surface Y(0) offers another 
advantage if a recursive least-squares method is used. The recursive 

method provides a memory device reducing the storage requirements from 
np+1 sets of output time histories to just two time histories. One of 
the two corresponds to the new response predicted by the most recent esti- 
mate of 0, and the other corresponds to the outgoing 0 that produced the 
highest cost. The penalty for this advantage is the need to integrate 
equations of motion twice per iteration; this result still requires sub- 
stantially less computational effort than that required with the usual MNR 
method. 

When using the recursive least-squares approach, only two changes are 
made to the MNRES algorithm just described. The first change is in the 
calculation of the AX matrix, and the second change is in the sensitivity 
calculation. The development of this formulation will begin with equa- 
tion (4-8) in condensed form (shown below). Everything discussed up to 
this equation in the previous development applies here. 
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(4-17) 


Y 


j 

ki 


X^S 


ki 


Simplifying the notation by dropping the i subscript and writing the matrix 
form of the equation (which removes the j superscript) gives 

Y k - X S k - (4-18) 

The least squares solution for the sensitivity vector is 

S k = [x T x ] -1 X T Y k (4-19) 

Now, dropping the k subscript and letting the following apply to the kth 
element of the output vector Y, a recursive relation can be defined for the 
r+1 iteration 


P 


r+1 



(4-20) 


and the updating equation defined as 


P 


r+1 





oi-l 


i°] 
r J 


(4-21) 


where the row vector a r is the new set of 0 to be included and a° r is 
the outgoing set of 0 which produced the high cost in J. The recursive 
relation for S, which has already been derived in chapter III-A, is now 


S _ L1 = S - P ^ 
r+1 r r+1 


oT o 

7 

r 


r o' 

L a r y. - y 


T 

i 

r 


+ (a T a - a oT a°)s ] 

r v r r r r J r J 


(4-22) 
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where y° and y r are the outgoing and incoming elements of Y^, respec- 
tively, at the rth iteration. With the new sensitivities determined, the 
algorithm proceeds as before. 

fr-3 PROPERTIES OF MWRES 

Properties of MNRES are discussed in comparison with the commonly 
used MNR algorithm. This approach allows comparison of convergence charac- 
teristics and computational advantages and disadvantages against a well- 
known benchmark. 

Convergence of NR or MNR algorithms, both with and without finite- 
difference derivatives, has been well documented (ref. 13). Convergence of 
MNRES can be shown, at least heuristically , by considering several 
details. First, the MNRES method is still fundamentally an NR method or, 
for this study, an MNR method. The only critical difference is that the 
derivatives are approximate which makes MNRES closer to MNR with numeri- 
cally determined derivatives. Second, note that fitting a first-degree, 
np-term polynomial to tip+1 data points is equivalent to a simple 
finite-difference method. In effect, as A0J (the distance between points 
on the fitted surface for MNRES) becomes small enough, the sensitivities 
become identical to that given by a simple finite-difference method, 
regardless of the actual functional representation of Y(0). The MNRES 
algorithm simply relaxes the accuracy of the sensitivities In order to 
reduce substantially the integration requirements; the degree of relaxation 
varies during the optimization process but can be controlled by limiting 
step size. 

The relaxation of sensitivity accuracy generally appears to be a very 
beneficial trade-off for Newton-Raphson algorithms. Before considering 
this relaxation of sensitivity accuracy, note that during an MNRES optimi- 
zation there are two times that the MNRES scheme computes sensitivities 
which are very close to that computed by a finite-difference scheme. These 
times are, first, during the initialization or first pass of the algorithm 
and, second, toward the end of the optimization process. During 
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initialization of the algorithm, n p different 0 are chosen (a perturba- 
tion on each element of 0 is sufficient) giving n p different response 
time histories, Y(9). The surface given by Y(0) is fitted. The initial 
0J can be chosen such that 

|(e j - e°)/0°| « 1 (4-23) 

for each j. For the algebraic solution form of MNRES this is completely 
equivalent to a simple finite-difference scheme and for the least squares 
form it is a very close approximation. In this study, the same A0-3 was 
used in the MNR with finite-difference derivatives as that used in the 
initialization of MNRES. This was done for comparison purposes; in prac- 
tice, the choice of perturbation size for 0 may be very different, as dis- 
cussed later. Towards the end of optimization the MNRES scheme again 
becomes equivalent to a simple finite-difference scheme since the A0 become 
very small. This forces the surface that is fit to Y(0) to become very 
small; thus, the slope information is computed for a surface fit to a very 
small area. 

The relaxation of sensitivity accuracy occurs between the two stages 
discussed above, i.e., after initialization and before convergence. During 
this part of the optimization large A0 may occur; this is characteristic of 
NR, MNR, or MNRES algorithms. For MNRES, unlike NR or MNR, these large 
steps cause the surface fit area to expand which means the slopes or sensi- 
tivities no longer approximate the slope of the Y(0) surface at a "point," 
i.e., no longer approximate the limit requirements of a derivative but 
rather average the slope over a larger area. This is a critical time 
period for the MNRES optimization. 

Three factors aid in preventing divergence during the critical time 
period. The first factor is that as the optimization process advances, 
MNRES continually eliminates values of 0 which are far from 9*, the optimal 
solution. This, in effect, contracts the expanding surface which is 
fitting Y(0) balancing the expansion process. As the updated estimates of 
0 get close to 0* the contraction process will dominate and slope (sensi- 
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tivity) information will approach that given by a finite-difference 
scheme. The second factor is that Newton’s algorithm and variations like 
NR, MNR, and MNRES advance more quickly as the quadratic approximation of 

the cost function improves; moreover, the Newton algorithm converges in one 

step for a quadratic cost function. Since the quadratic approximation of 

the cost function will generally improve the closer 0 gets to 0*, and since 

initial estimates of 0* are often given by a least-squares procedure or 
knowledgeable user, 0q tends to be "close" to 0*. Thus, for aircraft 
estimation problems, MNRES will generally start in a region conducive for 
convergence. The third factor is that step-size control logic can always 
be incorporated. Carried to the extreme, MNRES could always be forced to 
approximate the derivatives the same as a finite-difference method. Of 
course, convergence would be very slow because of the very small steps. In 
practice, one would let the algorithm take steps determined by the NR logic 
(as done in this study); and then if a convergence problem develops, step- 
size control would be incorporated. 

The computational advantage of MNRES is tied to two primary factors. 
The first factor is the number of unknown parameters, np. Both MNR and 
MNRES must integrate n s +n g n p differential equations on the first 
iteration; after that, however, MNRES integrates only n s state equations 
each pass and MNR continues to integrate n g state plus n s n p sensitiv- 
ity equations. It appears that as n p gets larger so does the advantage 
for MNRES. A limiting factor in MNRES is in equation (4-14) where AX must 
be inverted. This n p xn p matrix becomes more difficult to invert as 

n p gets larger and, unfortunately, is made up of very small numbers as 

the optimization process converges. Also, note that the information gained 
each pass is not equivalent between the two methods. MNRES performs less 
computation each pass and, consequently, has less information to update the 
estimates. However, when sufficient passes are performed to make the work 
done by MNRES equal to MNR, MNRES will have already performed n p +l para- 
meter updates. This allows MNRES to step more quickly towards the final 

solution. MNRES will achieve convergence faster than MNR as the cost 

function becomes more quadratic in nature. 
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The second factor and primary reason for the success of MNRES has to 
do with the degree to which the cost function can be approximated by a 
quadratic function. The quadratic approximation is inherent to the Newton 
type of optimization scheme and, therefore, both MNR and MNRES improve 
performance as the quality of this approximation improves. However, 
convergence occurs more quickly with MNRES. This makes sense in light of 
the way convergence takes place. 

Convergence takes place through an iterative process where estimates 
of the unknowns are updated each iteration. The updates are estimated by 
equation (5-9) which is the product of the information matrix and the 
gradient of the cost function. It is well known that convergence can be 
speeded up by holding the information matrix constant (see e.g. ref. 14) 
for a limited number of iterations. This eliminates the need to integrate 
the sensitivity equations for a limited number of iterations; note that 
integrating the sensitivity equations accounts for the majority of the com- 
putational effort. There are two choices, each representing one extreme, 
for optimization: (1) integrate the sensitivity equations to obtain the 
most accurate derivative information for each iteration (this is the most 
costly in terms of computational effort); or (2) hold the information 
matrix constant for a limited number of iterations (this is the least 
costly in terms of computational effort and the least desirable since there 
is no way to know what number of iterations to hold the information matrix 
constant without causing divergence). A compromise between these extremes 
is preferred. 

MNRES provides this compromise in a very efficient manner. A trade- 
off between computational effort and sensitivity accuracy is made automa- 
tically by MNRES. By using the surface fitting technique only the state 
equations need be integrated each pass and this information is incorporated 
in the solution with relatively little computation. The sensitivities are 
only approximated in this process, however, their accuracy is controlled 
sufficiently to allow convergence. 

The primary disadvantage of using MNRES comes from the computer memory 
required. rip+1 sets of output variable time histories must be retained 
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in this procedure. The recursive least-squares method discussed earlier 
reduces this problem to 2 sets of time histories; however, the computa- 
tional effort increases from n s to 2n s equations to be integrated each 
pass. The user's computer system would dictate which approach is more 
appropriate. 
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Chapter V 


CONFIDENCE INTERVAL ESTIMATION 

Confidence interval estimation (CIE) is an integral part of the 
parameter estimation problem. Point estimates of parameters without any 
qualifications to indicate their accuracy are of little value. An interval 
estimate which incorporates both variance and confidence level information 
provides a complete statement of the estimate quality. Although the 
Cramer-Rao lower bound is commonly used to qualify the ML parameter esti- 
mates, it is well known that in aircraft applications these bounds do not 
accurately reflect the true parameter variance. They are usually too opti- 
mistic (ref. 18). The difference between the lower bound and the actual 
parameter variance can be due to incorrect assumptions about measurement 
and process noise, bias errors in the estimates, or modelling error. How- 
ever, the nonlinearity of the estimation problem appears to contribute 
significantly. In this chapter a method is discussed for determining 
confidence intervals by analysis of the confidence region contours using a 
search scheme. In addition, a measure of nonlinearity is developed to 
further characterize the problem. 

A. CONFIDENCE REGION DESCRIPTION 

Confidence regions are described by a surface in parameter space 
representing a certain confidence level. The surface is defined by a sta- 
tistic which reflects the distribution of error in 0. From the distribu- 
tion of the statistic, a statement can be made about the probability of the 
statistic being in a certain interval, I. Assuming the relationship 
between the statistic and the parameters can be described, a further state- 
ment can be made that the parameters are contained in region, R c , with 
the same probability. Region R c reflects the variation in 0 as the sta- 
tistic varies in interval I. The above procedure is the general process by 
which any confidence interval or region is defined. This definition will 
obviously vary according to the definition of the statistic. A useful 
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statistic for composite hypothesis tests is created using the ratio of 
likelihood functions. 

Let , Z 2 , . .., N independent random variables with 

probability density functions 

f(Z i , 0) , i = 1 , 2 , . . . ,N (5-1) 

The testing hypothesis is formulated as Hq:0cco, where oj is a subset of 
parameter space ft. Defining the likelihood functions as 

N 

L(ft) = TT f(Z , 6) , 0eft (5-2) 

i=l 


N 

L(w) = TT f(Z , 0) , 0eu) (5-3) 

i-1 

and denoting L(ft) and L(u)) as the maxima of the likelihood functions, the 
likelihood ratio is defined as 


L(uj) 
X = 

L(ft) 


(5-4) 


This ratio forms a statistic which has a value between 0 and 1 since the 
numerator is limited by the Hq hypothesis. The value of X reflects the 
degree to which the Hq hypothesis is accepted and, therefore, can be used 
as a statistic to test the hypothesis. If a probability density function 
can be defined for X and the relationship between X and 0 can be solved, 
then a confidence region, R c , can be defined. With the confidence region 
determined the confidence intervals (extrema of parameters within the 
confidence region) can be defined. 
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The confidence region, R c , provides an exact description of the 
parameter error bounds. However, for the general nonlinear estimation 
problem, an approximation may be involved in defining the confidence level 
associated with R c . To resolve this problem, Beale (ref. 21) recommended 
the statistic for the linear estimation problem be used along with a cor- 
rection factor to account for moderate nonlinearity in the model. Since 
this approach for solving the nonlinear problem is based upon a correction 
to the linear problem, the development will continue with the linear case 
first. 

A-l CIE FOR THE LINEAR ESTIMATION PROBLEM 

The estimation problem is defined to be linear if the model equations 
are linear in the unknown parameters; the state, input and response vari- 
ables may or may not appear linearly in the model equaions. The form of 
the linear estimation model (single output) is given by equation (3-5), 
repeated here using 0 as the vector of unknown parameters (the number of 
measurements is taken to be N for this discussion) 

Y - X0 + e (3-5) 

In this linear regression problem, if Y is N(X9, la^) and the testing 
hypotheses are considered as 

H q : e t = e (5-5) 

Hj! 0 * e (3-6) 

where the subscript t indicates the true value. It can be shown that the 
likelihood ratio has the form 

X = exp{ — [ J ( 9 ) - J(§)]} (5-7) 

2a 
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where 


J(0) = (Y - X0 ) T (Y - X0) (5-8) 

and 

J( 0 ) = (Y - X0) T (Y - X0) (5-9) 

The statistic A can be equivalently replaced by 

U = J(9) - J(6) (5-10) 

or in practice by the statistic 

N-n J( 0 )— J ( 0) 

F = E t (5-11) 

n p J(0) 

where F is the 1-cip point of the F(np, N-np) distribution and a p is 
the confidence level. This is possible when the model is correct and J ( 0 )— 
J(0) is independent of J ( 0 ) (ref, 26), In addition, for the linear estima- 
tion problem, it is known that (ref, 23): 

1, 0 is an unbiased estimate of 0. 

2. Cramer-Rao bound is reached. 

Once the data is determined, J(0) is a function of the n p - 

A 

dimensional parameter space. In parameter space the function J(0)-J(0) can 
be represented by the contours of a surface. The contours are defined by 
J( 0 )= cons tan t • Again considering the general linear problem (single 
output) 

Y = X 0 + e (3-5) 
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the cost, J( 9) , can be expanded as 

J( 0 ) = (Y - X0) T (Y - X0) (5-12) 

= Y T Y - 20 T X T Y + 0 T X T X0 (5-13) 

Differentiating and setting to zero, the normal equations are obtained 

0 = -X T Y + X T X0 (5-14) 

and the solution for 0 is 

0 = [x T x ] -1 X T Y (5-15) 

The ellipsoidal surface with center at 0 is expressed in terms of 0 as 

J(0) - J(0) - 0 T X T X0 - 20 T X T Y + 20 T X T Y - 0 T X T X0 (5-16) 

Substituting for X^Y from the normal equations gives 

J( 0 ) - J( 0) = (0 - 0) T X T X (0 - 0) (5-17) 

which defines an ellipsoidal surface in the n p -dimensional parameter 
space. For the linear estimation problem, the contours form an ellipsoidal 
surface with a single global minimum. The slopes and orientation of the 
contour depends on the model and data; in addition, they give an indication 
of parameter correlation and conditioning of the problem. If the contours 
are greatly elongated (indicating many values of 0 give the same cost), an 
ill-conditioned problem may exist. Inadequate data or possibly overpara- 
meterization may be the problem. 
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With the relationship in equation (5-11), a confidence ellipsoid in 

A 

rip-dimensional parameter space with center 0 is defined such that the 
probability is 100(l-ctp)% that the true parameter point 0 is contained 
within the ellipsoid. This can be expressed by substituting equa- 

tion (5-17) into (5-11) 

(0 - 0) T X T X(0 - 0) < n s 2 F (n , N- n ) (5-18) 

— p a p p p 


where 


S 2 = J(0)/(N - n ) (5-19) 

P 

The confidence limits are determined by realizing the true value of 0 
lies inside the ellipsoid if and only if it lies between all points of 
parallel tangent planes to the ellipsoid. Therefore, the true value lies 
between the two tangent planes orthogonal to vector b (b*0) if and only if 
(see ref. 35) 


|b T (0 - 0) | < (b T H o 1 b) 1/2 


(5-20) 


where 


H 


a 



N - n )}H 
P 


(5-21) 


and H is the Hessian matrix of J. Therefore, the probability is 1-dp 
that for all b 


T 

be - 


b T 0 1 


2 

n s F (n . 
P « p p’ 


m n aV 1 ^ 1 / 2 

N - n ) (b H b) 

P 


(5-22) 


This states that the probability is 1-ap that for all 0^, i=l,2,...np 


0 i - 8 1 </ks /d^ 


(5-23) 
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or, expressed in terras of confidence limits, the probability is l-a p that 
simultaneously for all 0^ 


0-/ka s <0<0 + /ko s 
0 0 


(5-24) 


where 


k = 


n 

P 





(5-25) 
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A-2 C1E FOR THE NONLINEAR ESTIMATION PROBLEM 

The nonlinear estimation problem occurs when the unknown parameters 
appear nonlinearly in the model equations* In the nonlinear problem, 

several results will change from that found in the linear case (ref. 26): 

1. Assuming t is normally distributed will not imply 0 is normally 
distributed . 

2. s^ = J(9)/(N - n p ) is no longer an unbiased estimate of o^. 

3* There is no covariance matrix in general. 

4. F tests and lack of fit tests are not valid in general. 

There are some results which remain true, however. These are: 

1. The sum of squares, J(0), still represents the square of the dis- 
tance from (Zj ,Z 2 fZg , . . . ,Z N ) to a point in the estimation 
space . 

2. Minimization of J(0) still corresponds to finding a point in esti- 
mation space closest to (Zj ,Z 2 , . • • , Z^) . 

3. Confidence regions can still be defined; however, the confidence 
level will be an approximation. 

In reference 20 Beale recommends using a correction factor, , as a 
means to extend the confidence level definition of X to moderately nonlin- 
ear problems. For this case, equation (5-11) becomes 


50 



J(9) - J(9) =nsF (n , N - n ) |l + 
pap p 1 

P 


N(n +2) \ 

— — N I 

(N-n )n (pi 
P P / 


(5-28) 


and for the multi-output case, 


2 / Nn o 

J(0 ) - J( 9 ) = n s F (n , Nn -n ) |l + 7— 
p a^p op| (Nn 


o' n p )n p ♦) 


(5-29) 


where s is now given as 


s 2 . J(5) 


(Nn -n ) 
o p 


(5-30) 


represents a measure of nonlinearity (normalized curvature) of the 
solution locus near J(0). The method of computing for the multivari- 
able aircraft estimation problem is discussed in the next section. 

Determining the confidence contours, defined by equation (5-29), can- 
not be accomplished analytically as done in the linear case since the con- 
tours are not necessarily ellipsoidal. The contours may be very irregular 
and possibly with several local and global minima. Figure 4 shows the 
construction of a confidence interval for a one-dimensional problem. The 
solid and dashed curves in the figure represent nonlinear and linear cases, 
respectively. In parameter space, the dashed curve would form a symmetric 
ellipsoidal surface, whereas the solid curve would vary from this shape 
depending on the degree of nonlinearity. The confidence interval for the 
nonlinear case is indicated by 0 m: £ n and ®niax> f° r the li nea r case, the 
confidence interval is given by the dashed vertical lines, equidistance 
from 0. The search algorithm used in this study for finding the contour 
boundaries was presented in reference 24. This method tests a series of 
randomly selected points in parameter space to determine the position of 
the confidence region. Through many iterations, the limits of the region 
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are determined by retaining and updating the points on the boundary which 
maximize or minimize the unknown parameters. This turns out to be 
computationally very demanding even for problems with relatively few 
parameters. 


B. NONLINEARITY MEASURE FOR AIRCRAFT APPLICATIONS 

The following is a generalization and adaptation of Beale's develop- 
ment (ref. 21) of an intrinsic nonlinearity measure, , to the multivar- 
iable problem of airplane parameter estimation. This is an empirical 

measure of nonlinearity which, in this study, has demonstrated some utility 
in CIE problems. 

If P(0) represents the estimation space (or solution locus) in sample 
space, then P(0) is the point on the solution locus closest to the 
measurement Z. 0 is the point in parameter space which minimizes the cost 

function. If T(0) is defined as a point on a tangent plane at P(0) and W 

a N 

different values are chosen for the vector 0 near 0 (i.e. 0 W , w=l,.*.,W), 
then a crude measure of nonlinearity can be written as 

W 

Q. = l |P(9 ) - T(0 )| 2 (5-31) 

L I W W I 


A graphical representation of these quantities for a two dimensional sample 
space is shown in figure 5. is the sum of squares of the distances 

from the points P(0 W ) to the associated points T(0 W ) on the tangent 
plane at P(0*). Clearly, (ty depends on the number of points, P(0 W ), and 

on their distances from P(§). Beale suggests that these distances are 
proportional to the square of the distance of P(0 W ) to P(0). If D is 
defined as the sum of squares of the squared distances then 

W 

D = l |P(0 ) - P( ©) 1 4 (5-32) 

w=l 1 W 1 


and, thus, is normalized as 
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(5-33) 


N = n s 2 Q,/D 
* P t 


The values of this measure may still depend on the configuration or orien- 
tation of the points P(0 W ) relative to P(§), but it should not depend 
significantly on the number of points, P(0 W ), chosen or on their 
distances from P(§) (if not too large). 

To obtain the "intrinsic nonlinearity", , that Beale recommends, 

must be further restricted. is the value of when the para- 

meters are chosen such that T(6) is always at the foot of a perpendicular 
from P(0) on to a tangent plane at P($). In other words, is the 

minimum value of if the model and the experimental design are fixed 
(see fig. 5). 

The practical computation of the intrinsic nonlinearity measure is 
described in the following development. The sensitivities determined dur- 
ing the estimation process are needed in advance of the following calcula- 
tions. According to Beale, an empirical estimate of is 


N . = n s 2 Q,/D (5-34) 

4> P <P 

where np is the number of unknown parameters, s^ is the sum of squares 
of residuals. For the multiple output case, s^ is given by 

equation (5-30). 

The denominator, D, in equation (5-34) can be formulated as 

D= I f? (y ± ( 6 w ) - Y 1 (6)) T R -1 (y i ( 0 w ) - Y i (0))} 2 (5-35) 

w=l i=l 


where i is the time variable up to N data points. 

By letting T(0) be expressed as a first order Taylor expansion while 
using the sensitivity information from the estimation, can be computed 
as 
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Rewriting Cty and letting 
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1 1 W 1 
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This is now in the form of a standard least squares problem, 
is to find the value of which minimizes Q^; that is, 
distance (see fig. 5) given as 


d = 



2 


Therefore, the value of \\) which minimizes is $ given 

least squares minimizations: 


N rp “1 ^ T vs — 1 

for w=l,2,...,W: £ G , R G . £ G. R 6Y, 

W i=l 1 i=l 


Thus 


3i» w ) < 5 - 36 ) 

(5-37) 

(5-38) 

(5-39) 

(5-40) 

The problem 
minimize the 

(5-42) 

by W sets of 
(5-43) 
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(5-44) 


( {Y i - Gi«J T r‘(«Yi - CjtJ 

parameter error bound problem for aircraft, the 
following assumptions were made: 

1. Selecting W = 2n is sufficient to adequately sample the 
local surface of J(0) near 0. 

2. Selecting A0 as 

|a0 ± | = 5 0 (5-45) 

provides a reasonable distance from 0 to sample the surface 
J(0). The goal is to detect the nonlinearity from the 
tangent plane without going too far into the nonlinear range 
where the curvature (based on sensitivity information at 0) 
no longer applies • 

These assumptions represent a proposal for a unified approach in 
computing . Using this approach, results of various studies can be 
compared and the differences between confidence limits based on Cramer-Rao 
bounds and random search can be examined. N<j> and the error bounds deter- 
mined by random search can also indicate the effect of experimental design, 
especially input form and model error, on identif lability. 


W N 

1 i 

9 w=l i=l 


For application to the 
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Chapter VI 


APPLICATION TO SIMULATED AND REAL DATA 

The examples considered in this chapter demonstrate the ML/MNRES 
algorithm for estimating parameters and the search technique for estimating 
confidence intervals of the parameters. These methods are compared against 
commonly used techniques to provide a benchmark for comparison. The 

commonly used techniques are the ML/MNR for parameter estimation and the 
Cramer-Rao bounds for confidence interval determination. ML/MNR is used 
with both analytically- and numerically-determined derivatives. The 
Cramer-Rao bounds, taken from the information matrix, are adjusted to the 
95 percent confidence level. 

Only dynamical systems or airplane estimation problems are used in 
this study rather than classical test problems such as Rosenbrock's func- 
tion (ref. 13)* Using classical optimization problems, which usually 

require very little computational time to evaluate the cost function, could 
lead to different conclusions about the algorithms. For aircraft estima- 
tion problems, the bulk of computer time is spent performing integrations 
of the state and sensitivity equations. To prevent any bias in the results 
due to variations in programming efficiency or integration techniques, only 
estimation algorithms using the same integration method are compared. 

The performance of the methods used in this report will be evaluated 
with the following criteria: 

1. Accuracy of estimates 

2. CPU time to termination 

Termination is obtained when parameter and cost function fractional changes 
are computed to be within a specified precision. Both cost function 
change, Aj/j, and parameter change, A0/9, are required to be satisfied sim- 
ultaneously to prevent premature termination on a plateau where AJ<<1 and 
A0 is relatively large or on a steep slope where A0«1 and AJ is relatively 
large. 
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Besides estimate accuracy and CPU time to termination, an additional 
observation provided is the number of “equivalent evaluations." One unit 
of this measure is the amount of calculation required to integrate the 
system equations and to evaluate time histories of the output variables. 
Each method described in this report requires a different number of equiva- 
lent evaluations to make one update in the parameter estimates. This 
measure provides an indication of how efficiently information gained from 
system integrations is utilized. System integrations are the primary 
computational burden for any estimation method applied to dynamical 
systems. 

The examples in this study use both simulated data (examples I-III) 
and flight data (examples IV-VI). Except for the first two examples, the 
parameters estimated are the nondimens ional aircraft stability and control 
derivatives conforming to standard NASA notation. For examples I and II a 
simple linear system is integrated with an Euler integration method. Exam- 
ples III-VI involve the airplane problem and use the general equations of 
motion given by equations (2-5) to (2-15)* These equations are integrated 
with a fourth-order Runge-Kutta integration scheme. For comparison pur- 
poses, the same integration step size and the same computer (CYBER 175) are 
used in each example. 

For the airplane examples, the ML/MNRES algorithm is applied through 
program MAX. MAX is a very modular FORTRAN 5 code with dynamic memory. 
The modular format allows aerodynamic models or entire system models to be 
changed easily. The dynamic memory capability adjusts core memory automa- 
tically to the dimensions required. A block diagram of the general comput- 
ing scheme for ML/MNRES is given In figure 6. A block diagram for program 
MAX is given in figure 7 followed by table 1 defining the elements in 
figure 7. 


A* SIMULATED DATA STUDIES 

Simulated data offers three advantages for testing a new estimation 
algorithm. The first and most important advantage is that the true values 
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of the parameters are known; the second advantage is that the problem can 
be well posed and defined without modelling error; and third, the degree of 
complexity can be selected. In this study, the first two of three simula- 
tion examples use a single-input/double-output, linear second-order system, 

X = AX + BU, X (0) = 0 (6-1) 
s s s 

Y = X (6-2) 
s 

These examples are used to demonstrate the relative speeds and accuracy of 
the sample estimation algorithms and to initially indicate the preferred 
methods. The third simulation, uses a nonlinear aircraft model and simu- 
lates varying levels of measurement noise found in real flight data. This 
example demonstrates the accuracy of program MAX. 

A-l EXAMPLE I 

Example I demonstrates and compares the FPS, MNR, and MNRES optimiza- 
tion schemes in a simple ML estimation problem. The MNR method uses a 
finite difference method to compute derivatives. This satisfies one 
requirement of this study, which is to eliminate the need to derive analy- 
tical gradients. MNR generally performs with about the same speed using 
either numerically-determined derivatives or integrated sensitivity equa- 
tions (ref. 15). MNRES uses the same finite difference method as MNR to 
determine sensitivities during initialization; however, MNRES uses the 
recursive least squares form of the algorithm during optimization. Because 
of the small memory requirements to store time histories in this example, 
only one integration per pass is performed. The purpose of this example is 
to compare the relative performance of each method on a problem involving a 
simple dynamic system. 

Example I has six unknown parameters. The six unknown parameters in 
equation (6-1) are the four elements of the 2x2 system matrix, A, and two 
elements of the control input matrix, B. The input form was chosen as 
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sin t , 0 < t < 2 tt 


u - 


0 


t > 2ir 


and data points were generated every .25 seconds. Process and measurement 
noise were excluded for this example. 

Table II shows the true values, initial values, and final estimated 
values of the six unknown parameters. Figure 8 shows the input and 
response time histories. All three algorithms accurately converged to the 
true parameter values using only the first 5 seconds of data. The MNR 
method was 30 times faster than FPS and MNRES was twice as fast as MNR or 
60 times faster than FPS. The number of equivalent evaluations had similar 
ratios, i.e. 715:28:12. 

Table II shows clearly that optimization problems having reasonable 
starting values and involving time consuming cost function evaluations 
should not be solved with direct search schemes, such as FPS. This result 
is supported by an independent study using aircraft estimation problems in 
reference 29. Reasonable initial values tend to provide a more quadratic- 
like cost function for which Newton’s method is most effective. If reason- 
able initial values are not available, the FPS may be more attractive. In 
light of the results of Example I, further study concentrated only on the 
gradient methods. 

A-2 EXAMPLE II 

Example II is provided to demonstrate the robustness of MNRES compared 
with the commonly used MNR. The more common form of MNR with analytically- 
derived sensitivity equations is used to prevent any deterioration of the 
algorithm due to approximating the sensitivities. The system from 
example I is analyzed again except measurement noise is added and a pulse 
input is used to excite the system. Two cases are considered, each with 
different levels of measurement noise. The noise is zero mean and Gaussian 
with standard errors of 0.0001 and 0.001 for cases 1 and 2, respectively. 
Figure 9 shows time histories of the input and response variables for the 
two cases. Table III shows the estimation results. 
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In case 1 both methods produce equally degraded results; however, 
MNRES still converged to the same precision level more quickly. In case 2, 
with a severe noise level and limiting the information to 3 seconds of 
data, MNRES was unable to converge. The results showed that it was oscil- 
lating about a solution, unable to find a new parameter vector which would 
produce a lower cost. The MNRES used on this example had no special step- 
size control logic. The solution that was obtained, however, was as 
accurate as that obtained by MNR, which did converge. 

Meeting convergence requirements does not guarantee accurate results; 
the error in the estimates ranged from 5 percent error to 130 percent 
error. MNR had both the most accurate and the least accurate estimate. 
The importance (sensitivity) of a parameter to the model will significantly 
affect the accuracy of the estimate, particularly under these adverse con- 
ditions. Based on these examples, it appears that MNRES is computationally 
more efficient than MNR while providing the same level of accuracy. 

A-3 EXAMPLE III 

In this example, the accuracy and robustness of ML/MNRES are demon- 
strated by application to a nonlinear aircraft simulation with known 
measurement noise levels. In addition, program MAX is validated. For this 
example and all other aircraft examples, the computationally least 
demanding form of MNRES is used to compute sensitivities. This form uses 
the linear surface fit with equation (4-14) instead of the recursive least 
squares form with equation (4-22). The simulation involves a nonlinear 
lateral model of a general aviation aircraft. 

Three cases are considered: case 1 without any measurement noise; 
case 2 with a representative noise level typical of flight data for the 
aircraft; and, case 3 with twice the noise level of case 2. The standard 
errors of the simulated measurement noise are shown in table IV. In each 
case, the noise is zero mean and Gaussian. The simulated data was created 
by a fourth-order Runge-Kutta integration with a step size of .05 sec. 
Table V shows the terms used in the nonlinear aerodynamic model to create 
the simulation and the parameter estimates obtained through analysis of the 


60 - 



simulated data. Time histories are provided for the three cases in 
figure 10. The control inputs were the same for all three cases and are 
shown in figure 11. 

Program MAX was applied using two convergence criteria: Aj/j l.E-03 
and A0/0 <_ l.E-03. As expected, the estimates of the less easily identi- 
fied nonlinear terms, such as and Cjr , are more quickly 

Op 

corrupted as the noise levels increase; however, the estimates are still 
very reasonable and the time histories are accurately predicted. Table V 
shows that the MNRES method can be used effectively in estimating 
parameters for nonlinear aircraft systems. 

B. REAL FLIGHT DATA STPPIES 

In this section three examples are considered. In each example the 
model structure and initial parameter estimates were determined using the 
MSR program of reference 11. For the first two examples, the parameter 
estimation problem is solved by using two different ML programs. The first 
is program MAX which uses the ML/MNRES algorithm as described in 
example III. The second is program MAXLIK which uses an ML/MNR algorithm. 
This MNR algorithm integrates analytically derived sensitivity equations to 
obtain sensitivities. MAXLIK is a proven code for aircraft-parameter 
estimation documented in reference 31. In the last example, program MAX is 
used to compute parameter estimates and Cramer-Rao bounds. These bounds 
are adjusted to the 95 percent confidence level and compared with that 
obtained using the search method. 

For comparison purposes, both program MAX and MAXLIK use a fourth- 
order Runge-Kutta integration method with the same integration step size 
(.05 sec in example IV and .04 sec in examples V and VI). A convergence 
criterion is set at Aj/j = .001 for both codes. Program MAX normally uses 
an additional criterion restricting the parameter change A0/0; however, it 
is removed in these examples to ensure that both programs converge for the 
same criterion. Both programs use the same bias and scale-factor correc- 
tions to the flight data for each example. These corrections were deter- 
mined by using a compatibility program developed in reference 34. The same 
initial parameter values are used by both MAX and MAXLIK. 
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R-l EXAMPLE IV 

Example IV uses flight data from a general aviation aircraft operating 
at an angle of attack of 8°. The estimation problem involves the nonlinear 
lateral model. Table VI presents a comparison between parameter estimates 
and Cramer-Rao bounds from both MAX and MAXLIK. Initial values and 
sensitivities computed as 0 ^ j M j j are also given. Again, there is 
reasonable agreement between the two approaches. Cramer-Rao bound esti- 
mates tend to be a little higher for program MAX. This is probably due to 
their sensitivity to the derivative information. 

Repeating the calculations with program MAX, by allowing the sensitiv- 
ity ratios to be incorporated into the initializing derivative calcula- 
tions, provided a small improvement in the overall speed of the algorithm. 
This occurred because only one restart was required during the optimization 
process. More improvement would be realized in problems where restarting 
occurs several times. Time histories of the measured flight data and pre- 
dicted response using the estimated model are shown in figure 12. Execu- 
tion times for example IV are 793 seconds for program MAX and 1036 seconds 
for program MAXLIK. 

B-2 EXAMPLE V 

This example uses flight data from an advanced twin engine fighter 
operating at an angle of attack of 6°. A nonlinear longitudinal model Is 
used. Table VII presents a comparison of parameter estimates and their CR 
bounds for both MAXLIK and MAX (case 1). Also shown for each program is 
the time to reach convergence expressed in seconds. Program MAX converged 
very close to the same values as program MAXLIK except processing was done 
in one fourth the time. This example reflects the effectiveness of MNRES 
under fortunate conditions (that is, where the cost is well approximated by 
a quadratic and a moderate number of unknowns (11 parameters) are deter- 
mined). The quadratic nature of the cost is indicated by a very small 
value of Ity. The value of was .003 for this example. The mean 
value estimates of MAX are very good and the CR bounds are good but tend to 
indicate a slightly larger error bound than MAXLIK. 
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Although the Fisher information matrix is updated each iteration by 
both programs, program MAX delays updating the weighting matrix, R“1 , 
until convergence is achieved. The example is solved once more by program 
MAX and the weighting matrix is updated twice. The results are shown in 
table VII, MAX (case 2). The mean values are essentially the same since 
they are independent of the weighting matrix used, except possibly through 
some numerical errors. The standard errors are slightly closer to the 
MAXLIK results and further updating brings only very small improvements. 
These estimates were obtained by MAX in about 40 percent of the MAXLIK 
processing time. 

B-3 EXAMPLE VI 

The sixth example uses flight data from an advanced single engine 
fighter operating at an angle of attack of 4°, This example involves a 
nonlinear lateral model. In this example, 95 percent confidence intervals 
are estimated using two approaches. One approach Is based on the CR bound 
using program MAX and the other on a random search technique. The 95 
percent confidence intervals determined by each approach are presented in 
table VIII. In this example ffy was found to be .02; however, it was set 
to zero for the interval computations. Even with this correction, only a 
very small increase in interval size would be obtained. The confidence 
intervals determined by the search method are significantly larger than the 
corresponding CR estimates and indicate an asymmetric confidence interval. 

C. DISCUSSION OF RESULTS 

The general experience with ML/MNRES and the examples chosen for this 
study indicate that ML/MNRES will perform better than ML/MNR for estimation 
problems involving dynamic systems such as aircraft systems. In general, 
MNRES should perform well in any problems for which the Newton-Raphson 
family of methods is appropriate (that is, where the cost can be reasonably 
approximated by a quadratic). The results of this study also indicate that 
a search technique is needed to accurately assess the parameter error 
bounds in the nonlinear estimation problem; for the linear problem or prob- 
lems with very little nonlinearity, the Cramer-Rao bounds should agree with 
values determined by the search technique. 
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C-l PARAMETER ESTIMATION 

Two important goals in this study were to develop an estimation algo- 
rithm which, first, eliminated the requirement to reformulate the algorithm 
for each new model and, second, provided a more efficient method to deal 
with the computationally more burdensome nonlinear problems. The first 
goal has been surpassed through the ML/MNRES algorithm coded in program 
MAX in two respects: (1) it does not require the derivation of sensitivity 
equations to complete the formulation of the algorithm and the modular form 
of MAX allows easy application to any system; and (2) it provides many 
computationally efficient options to the user as to how the sensitivities 
will be approximated (accuracy and order of derivatives, also options for 
memory requirements) and these options are readily incorporated because of 
the modular format. The second goal has been demonstrated in the examples 
but further discussion will clarify the conditions under which this goal 
has been met. 

The first two examples use a simulated linear system with and without 
noise. This system is readily identifiable except when severe noise and 
nonoptimal inputs are included. The initial values were relatively close 
to the final solution and so allowed a good quadratic approximation of the 
cost function thus providing a condition conducive to convergence. The two 
Newton-Raphson methods, MNR and MNRES , were substantially faster than the 
search method as expected under these conditions. MNRES, however, capital- 
ized more efficiently than MNR on the information obtained from each inte- 
gration of the system equations. Each integration of the system equations 
provides information which is immediately incorporated into the numerical 
process when using MNRES. When using MNR, n p +l system integrations 
(equivalent evaluations) are required before each updating operation, MNRES 
requires only one; for example I, the ratio of equivalent evaluations was 
28:12. MNRES made much more efficient use of the system integrations. The 
results indicated both MNR and MNRES to be very fast relative to the search 
technique, thus search methods were eliminated from any further study. The 
MNRES approach was a little more than twice as fast as MNR. 
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The third example demonstrated that the ML/MNRES algorithm was a 
viable method for a nonlinear aircraft estimation problem with both realis- 
tic and heavy noise levels. This example provides confidence in the 

ML/MNRES approach. However, since it is a simulated example, it cannot be 
accepted as conclusive. Simulations always provide optimal conditions for 
estimation algorithms since problems such as modelling error, bias errors, 
unknown noise spectra and general data incompatibility are not present. 

Unlike simulated data, real flight data often present problems (as 

just mentioned) for any estimation method; these problems may slow the 
convergence process or even stop it. The last three examples consider real 
flight data; these examples were specially selected to reflect differing 
levels of difficulty for the estimation algorithms. Examples IV and VI 
compared with example V demonstrate a representative range in the degree of 
difficulty (measured by computational effort) for the algorithms and, 
unsurprisingly, the degree of nonlinearity (measured by N<j, ) is also 
largely varying. N^ differed by an order of magnitude between example V 

and VI (VI being more nonlinear). ML/MNRES was again faster than the 

benchmark program, ML/MNR. For the more nonlinear examples, convergence 
time for ML/MNRES was 70 to 80 percent of the time required for ML/MNR; in 
the less difficult problem, ML/MNRES required only 40 percent of the bench- 
mark time. These examples give some credence to the superiority of 
ML/MNRES. However, they also indicate a large variability in the superior- 
ity. 

The variability appears to be that as the degree of nonlinearity 
increases, the methods approach the same speed of convergence. The compu- 
tational advantage of ML/MNRES tends to be reduced as the nonlinearity 
increases. A moderate number of unknowns are used in each case (examples V 
and VI) so the advantage due to the difference in n s +n s n p Integra- 
tions per pass and n s integrations per pass is probably a small factor 
(see IV-B). The main factor, however, is the quality of the quadratic 
approximation of the cost function which, of course, is directly related to 
the degree of nonlinearity of the cost. Both methods are slowed as the 

nonlinearity increases or the quadratic approximation becomes poorer, but 
the MNRES method is much more dependent on the quality of the quadratic 
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approximation since in effect it is an approximation of the MNR method. As 
the nonlinearity increases, MNRES looses its advantage over MNR. 

Figure 13 offers a graphical view of the performance of MNR and MNRES 
in example V, where the quadratic approximation is fairly good. The graphs 
show the value of the convergence criterion versus CPU time. Program 
MAXLIK using MNR follows a typical convergence pattern; the small oscilla- 
tions before convergence are due to the updating of the weighting matrix, 
R, each pass after the criterion falls below .01 in value. This approach 
with MNR has been found to be beneficial in these problems. Program MAX, 
on the other hand, updates the information matrix each pass but holds the 
weighting matrix constant until the first convergence is achieved. At this 
point the final parameter estimates are obtained, however, the Cramer-Rao 
bounds (determined from the information matrix) are not converged. The 
information matrix, M, will not converge until the weighting matrix, R - ^ , 
is updated sufficiently. This is understandable since the parameter esti- 
mates are asymptotically independent of the weighting matrix whereas the 
Cramer-Rao bounds depend on the information matrix which in turn depends on 
the value of R. Therefore, two more cycles are made to convergence ensur- 
ing that the weighting matrix has stabilized. Note that the first step in 
each method takes the same amount of time and achieves the same reduction 
in cost; this is expected since initializing MNRES requires the same 
computations as the first pass in MNR. 

A key feature of ML/MNRES is in the method of updating the information 
matrix. Although both ML/MNRES and ML/MNR update the information matrix 
each pass, they each do it quite differently. One well known way to 
improve the speed of techniques which involve the Hessian matrix, or 
approximations to it, is to hold the information matrix, M, constant for 
one or more iterations. This reduces the amount of integration required 
per pass to be the same as in MNRES since no sensitivity equations are 
integrated. How many iterations M can be held constant is unknown and 
unknowable in advance. So there are two alternatives generally known and 
used. One is to integrate the sensitivity equations each pass requiring 
maximum computational effort but giving maximum accuracy to the optimiza- 
tion process. The other alternative is to skip integrating the sensitivity 
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equations for a necessarily infrequent number of iterations to hopefully 
increase convergence speed. In methods like the ones considered in this 
study where large steps in the optimization process may occur, there is a 
lot of danger in not updating the information matrix. ML/MNRES finds an 
effective compromise between the alternatives discussed above. 

The compromise is achieved by updating the information matrix each 
pass. However, only the information obtained from integrating the equa- 
tions of motion once each pass is incorporated. Thus, a compromise is 
achieved where updating is occurring at minimal computational cost. 
Because of the limited information to update the information matrix, a sub- 
optimal but computationally more efficient path is followed to conver- 
gence. The result is that ML/MNRES requires many more passes to reach 
convergence, but each pass requires much less computational effort than 
ML/MNR. The net result is much faster convergence depending on the degree 
of nonlinearity of the cost and the quality of the quadratic approximation 
used by the method. 

C-2 CONFIDENCE INTERVAL ESTIMATION 

Confidence intervals obtained in example V were found to be very close 
to the CR bounds adjusted to the 95 percent confidence level. In addition, 
the value of was very small and convergence occurred relatively 
quickly. This indicates that the cost function was very well approximated 
by a quadratic function. In analogy to figure 4, the construction of the 
confidence intervals for example V would place the dashed and solid lines 
virtually on top of each other at the error level selected. The quality of 
the quadratic approximation is confirmed by the very fast convergence of 
MNRES relative to MNR. 

Confidence intervals obtained in the sixth example were found to be 5 
to 10 times the size of the CR bound adjusted to the 95 percent confidence 
level (Table VIII). This is in agreement with references 18 and 23 on the 
values generally found in analyzing actual flight data. Reference 23 used 
the search technique and reference 18 estimated the confidence intervals by 
computing an ensemble average. This also indicates that this cost function 
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is much more nonlinear than that for the first example, and it is not well 
approximated by a quadratic function at the error level considered. This 
is confirmed by the relative speed of convergence between MNRES and MNR. 
MNRES, with two updates of R, required 85 percent of the time MNR required. 

It appears from the examples considered that a primary factor in 
determining the confidence interval size for airplane stability and control 
derivatives is the degree of nonlinearity of the cost function. Other 
factors, such as bias errors, modelling errors, noise level and noise spec- 
tra, may contribute directly to confidence interval size or may manifest 
themselves as part of the nonlinearity of the cost function. In this 
study, the other factors were not tested to determine their contribution. 

At present, the random search technique is the only method to deter- 
mine confidence bounds accurately. Clearly, the CR bounds, which are tied 
to the quadratic approximation inherent in the information matrix, will 
always be different from the parameter variance. This difference will 
mainly depend on the quality of the quadratic approximation. The only 
disadvantage of the search technique is its relatively poor convergence 
rate combined with the large computational effort required in this type of 
problem. Although Beale’s measure of nonlinearity, , was designed to 
correct the confidence level in the CIE problem, there seems to be more 
utility in considering (or some similar measure) as a way to discern 
if the lengthy computations of the random search are needed. If very 
little nonlinearity exists, the user can be reasonably confident that the 
Cramer- Rao bounds provide accurate error bound information. 


68 



Chapter VII 


SUMMARY AND CONCLUDING REMARKS 
A. SUMMARY 

An algorithm for maximum likelihood (ML) estimation is developed with 
an efficient method for approximating the sensitivities. The algorithm is 
applicable to most parameter estimation problems and is particularly suited 
for nonlinear, multivariable, dynamic systems. The ML algorithm relies on 
a new optimization method closely related to a modified Newton-Raphson 
(MNR) technique; the new method is referred to as a modified Newton-Raphson 
with estimated sensitivities (MNRES). 

MNRES determines sensitivities by using slope information from local 
surface approximations of each output variable in parameter space. The 
fitted surface allows sensitivity information to be updated at each itera- 
tion with a significant reduction in computational effort. MNRES deter- 
mines the sensitivities with less computational effort than using either a 
finite-difference method or integrating the analytically-determined sensi- 
tivity equations. The choice of the type of surface (for example, nth- 
order polynomial or spline, etc.) and the method of fitting the surface 
(for example, least squares or simply solving simultaneous equations) is 
made by the user to suit the particular need. MNRES eliminates the need to 
derive sensitivity equations for each new model, thus eliminating algorithm 
reformulation with each new model and providing flexibility to use model 
equations in any format that is convenient. 

Two surface-fitting methods are discussed and demonstrated, while 
other possibilities are indicated. Comparisons are made between MNRES and 
other commonly used optimization methods such as a search method called the 
flexible polyhedron search (FPS) and a gradient method called the modified 
Newton-Raphson method. Several sample problems are solved to compare the 
techniques. Simple linear systems are used at first, and then nonlinear 
aircraft estimation problems are solved by using both real and simulated 
data. MNRES is found to be equally accurate and substantially faster than 
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the commonly used techniques. The reduction in computational effort 
provided by MNRES is dependent on several factors: the choice of surface- 
fitting method; the number of unknown parameters; data quality; accuracy of 
the sensitivity calculations; and, particularly, the degree of nonlinearity 
of the cost function, 

A search technique for determining the confidence limits of ML para- 
meter estimates is applied to nonlinear estimation problems for airplanes. 
The confidence intervals obtained by the search are compared with Cramer- 
Rao (CR) bounds at the same confidence level. It is observed that the 
degree of nonlinearity of the cost function is an important factor in the 
relationship between CR bounds and the error bounds determined by the 
search technique. The CR bounds were found to be close to the bounds 
determined by the search when the degree of nonlinearity was small. The CR 
bounds were 5 to 10 times too conservative (too small) when the nonlinear- 
ity was significant. Beale’s measure of nonlinearity is developed in this 
study for airplane identification problems; it is used to empirically 
correct confidence levels for the parameter confidence limits. The primary 
utility of the measure, however, was found to be in predicting the degree 
of agreement between Cramer-Rao bounds and search estimates. 

B. CONCLUDING REMARKS 

The primary contribution in this study is an efficient maximum likeli- 
hood estimation algorithm. However, inherent in this study is a suggested 
methodology for solving the nonlinear airplane identification problem. The 
use of a modified stepwise regression in conjunction with several testing 
criteria is suggested to determine the airplane aerodynamic model struc- 
ture. This very efficient scheme was developed in a prior study to handle 
the widely varying model structure in nonlinear flight regimes. A maximum 
likelihood scheme (ML/MNRES), developed in this study, is then recommended 
to obtain optimal parameter estimates. This method is more efficient than 
other commonly used techniques in airplane estimation problems and provides 
some practical computing options. Finally, a random search procedure is 


70 



required to determine parameter confidence limits for the nonlinear case. 
This is used in conjunction with Beale’s measure of nonlinearity (adapted 
to the airplane problem) to make an empirical correction to the confidence 
level. It is also used to determine if the extensive calculations of the 
random search are needed to estimate confidence limits. 

The new optimization algorithm, MNRES, has three advantages over other 
commonly used techniques. The first advantage is that the algorithm 
removes the need to derive sensitivity equations for each new model; this 
eliminates the computational burden of integrating the sensitivity equa- 
tions during each iteration of the algorithm. This also provides much 
flexibility, allowing the model equations to be in any format that is 
convenient - such as splines, polynomials, or a nonanalytic form. Also the 
quickly varying model structure sometimes found in the nonlinear regimes is 
readily handled. The second advantage is that the algorithm is effective 
for a variety of surface fitting methods chosen to fit the output vector 
surface in parameter space (needed for sensitivity estimation); this allows 
the user to choose a surface-fitting method best suited to the problem. An 
approach is discussed which reduces storage requirements with little addi- 
tional computation. The third advantage of the algorithm is that it 
reduces the computational effort in comparison with the commonly used 
modified Newton-Raphson (MNR) method. For small problems (fewer than 15 
parameters to be estimated), the reduction can be substantial. For larger 
nonlinear problems, the reduction may be more modest; however, improvements 
may still be significant if data quality, signal compatibility, and sensi- 
tivity calculations are good. Even though the application of interest for 
this study was an aircraft operating in nonlinear flight regimes, the 
approach should be effective for many other nonlinear, dynamic systems. 
Based on this study, the ML/MNRES algorithm generally performs better and 
offers more versatility than the commonly used ML/MNR algorithm. 

The suggested methodology recommends a random search technique to 
obtain parameter confidence limits for the maximum likelihood estimates. 
Since the nonlinear problem does not lend itself to an explicit analytical 
solution, the search uses a random sampling algorithm to find the confi- 
dence limits; unfortunately, this method is computationally demanding, 
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particularly for cases with a large number of unknowns. Unless sufficient 
repeated measurements are available, it is the only method to accurately 
determine confidence region boundaries in the nonlinear case. Beale’s 
measure of nonlinearity is used to provide an empirical correction to the 
confidence level used by the search. Although this was Beale’s intended 
use, it has little affect on the confidence limits for airplane applica- 
tions. However, it was shown that the degree of nonlinearity and the 
degree to which the Cramer-Rao bounds and the random search estimates agree 
is closely related. Therefore, it is recommended to use this or some 
similar measure to determine the necessity of the search calculations. 

If further studies are made with MNRES, it should prove beneficial to 
use more efficient inversion schemes than the standard Gaussian elimination 
used in this study. This may improve the algorithm for larger numbers of 
unknowns. Also, further consideration should be given to defining the 
relationship confidence intervals and the nonlinearity of the cost function 
have with other factors such as bias errors, modelling errors, input form, 
and noise spectra. In addition, more investigation into measures of non- 
linearity and their best computing schemes would be advantageous. Nonlin- 
earity measures may be useful for reflecting the quality of the experiment 
since parameter error bounds will vary with model error and optimality of 
the input form. Finally, significant computational savings would be 
achieved if the confidence limits for the nonlinear estimation problem 
could be determined using gradient techniques rather than the computation- 
ally demanding search scheme used in this study. 
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Figure 1.- Airplane body axis system with forces and moments 
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Figure 2. Two steps of a two-dimensional flexible polyhedron search. 




Figure 3.- Linear-surf ace fit for two iterations of MNRES 




Figure 4.- Construction of confidence intervals for one-dimensional case 
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Figure 5.- Two-dimensional sample space with solution locus, P(6), and 
tangent plane, T(6), at P(0). 





Figure 6.- Computing scheme for ML/MNRES. 





Figure 7.- Flowchart for program MAX. 
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Figure 9.- Time histories of input and response variables for Example II 
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Figure 10.- Continued. 
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Figure 10.- Concluded. 
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Figure 12 .- Measured and predicted responses for lateral maneuvering using 
real flight data. 
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Figure 13.- Convergence criterion versus CPU time for programs MAX and 
MAXLIK. 



Table 1. Primary subroutines for program MAX and flowchart block: 
definitions 


Subroutine 

Description 

AERO 

Computes aerodynamic forces and moments with 
selected aerodynamic model 

COST 

Computes residuals, fit error, R7*and cost 

DIFFEQ 

Computes state derivatives from selected equations 
of motion 

EST 

Computes updated parameter estimates 

HICOST 

Determines if new estimates reduce cost and 
updates storage of outgoing and incoming 
parameters and response time histories 

INT 

Main integration subroutine, computes initial 
conditions and input arrays 

MASTER 

Primary subroutine represented by flowchart; 
handles initializations, I/O operations and memory 
management 

OUTPUT 

Computes selected output time histories for cost 
function and plot routines 

RK4 

Fourth-order Runge-Kutta integration scheme 

SENE ST 1 

Computes sensitivities using a finite difference 
method 

SENEST2 

computes sensitivities using a selected surface- 
fitting method 

Decision Blocks 

Definitions 

FAIL 

test if new estimates reduce cost 

PASSES 

test for maximum number of allowed passes 

PASS #1 

test for first pass 

RESTARTS 

test for maximum number of restarts 

R UPDATES 

test for maximum number of weighting matrix 
updates 
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Table II. Comparison of estimates and computation time for FPS, MNR, and MNRES using 
a linear system without measurement noise (Example I) 


Unknown pa r arae t e r s , 
0 

True 

Value of 
0 

Initial 
Value of 
0 

Fina: 

L estimated values 
ising method - 

FPS 

MNR 

MNRES 

9 1 

0 . 

0.01 

-0.12 E-03 

0.89 E-07 

0.73 E-06 

0 2 

-1.5 

-1 .6 

-1.5 

-1.5 

-1.5 

9 3 

1.0 

1.1 

1.0 

1.0 

1.0 

9 4 

- .5 

- .6 

-.5 

-.5 

-.5 

0 5 

.2 

.25 

.2 

.2 

.2 

6 6 

.1 

.15 

.1 

.1 

.1 

| 

Cost 



0.14 E-08 

0.61 E-10 

0.11 E-07 

Equivalent evaluation 



715 

28 

12 

At, sec 



2948 

106 

47 
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Table III. Comparison of estimates and computation time for MNR, and MNRES 
using a linear system with two noise levels (Example II) 


(a) Case 1 


Unknown parameters , 
0 

True 

Value of 
0 

Initial 
Value of 
0 

Final estim 
using m 

ated values 
ethod - 

MNR 

MNRES 

e i 

0 . 

0.01 

-0.0675 

-0.0684 

9 2 

-1.5 

-1.6 

-1.471 

-1.471 

9 3 

1.0 

1.1 

1.009 

1.010 

9 4 

- .5 

- .6 

- .449 

- .448 

9 5 

.2 

.25 

.202 

.202 

0 6 

.1 

.15 

.098 

.098 

Cost 



0.105 E-06 

0.105 E-06 

Equivalent evaluation . . . 



42 

12 

At, sec 



77.54 

24.08 


(b) 

Case 2 



True 

Initial 

Final estimated values 

Unknown parameters, 

Value of 

Value of 

using method - 

0 

0 

0 






MNR 

MNRES 1 

0 

i 

0 . 

0.01 

- .705 

- .410 

1 

0 

-1.5 

-1.6 

-1.228 

-1.549 

2 





0 

1.0 

1.1 

1.757 

.799 

3 





0 

- .5 

- .6 

.159 

- .238 

4 





0 

.2 

.25 

.210 

.251 

5 





0 

.1 

.15 

.087 

.037 

6 





Cost 



0.104 E-04 

0.122 E-04 

Equivalent evaluation ... 



70 

27 

At, sec 



134.44 

56.05 


^MNRES did not converge. 
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Table IV. Standard errors of simulated measurement noise 



Standard deviations for - 

Output Variable 




Case 1 

Case 2 

Case 3 

3 , rad 

0 

0.010 

Jmk-niFHiBf. 

p, rad/ sec ... 

0 

0.010 


r , rad/sec . . . 

0 

0.010 

0.02 

<f> , rad 

0 

0.005 

0.01 

a , g units .. 

y 

0 

0.005 

0.01 
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Table V. Simulated-data analysis of a nonlinear aircraft system using 
ML/MNRES 


Unknown 
parameter , 

Simulation Values 

Parameter estimates 

for - 

e 


Case 1 

Case 2 

Case 3 

°i.o 

0.13 

0.1299 

0.1298 

0.1295 

C Y 

e 

-.41 1 

-.4136 

-.4261 

-.4401 

C Y 

p 

- . 146 

-.1524 

-.1874 

-.2379 

C Y 

r 

.63 

.6686 

.6070 

.5412 

° Y 6a 

-.053 

-.0618 

-.0733 

-.0872 

V 

.075 

.0794 

.0775 

.0751 

c t f 0 

0 

.0001 

-.0003 

-.0005 

c * 

*6 

-.123 

-.1223 

-.1228 

-.1240 

C £ 

P 

-.397 

-.3988 

-.4026 

-.4094 

C £ 

r 

.257 

.2573 

.2409 

.2239 

C £ 

6a 

-.182 

-.1815 

-.1778 

-.1755 

C £ 

*r 

.077 

.0067 

.0059 

.00497 

c = 

T P 

2.63 

2.6254 

2.519 

2.4359 

C 

n,0 

0 

-.00005 

-.00008 

-.0001 

C 

n 6 

0 

.000003 

.0001 

.0005 

C 

n 

P 

-.15 

-.1488 

-.1524 

-.1558 

C 

n 

r 

-.083 

-.0828 

-.0861 

-.0911 

C 

n 6r 

-.0431 

-.0425 

-.0434 

-.0445 

C 

n 

ar 

1.7 

1.7343 

1.4419 

1.0118 
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Table VI. Real-data analysis of a nonlinear aircraft system 
using programs MAX and MAXLIK 


Unknown 

parameter, 


Initial Value 
of 0 


Program MAX 


Program MAXLIK 


- .479 



0.0061 0.0006 0.0213 


-.4603 


.0075 -.4608 


.0005 

.0067 


0 j M jj 


.7533 E+03 
.3756 E+05 


- .186 


-.1378 


.0485 -.0604 


.0439 


.9373 E+03 


.6677 


.0289 


.6209 


.0256 


.1915 E-04 


- .08 


-.0504 


.0166 -.0375 


.0150 


.8129 E+03 


.0814 


.0043 


.0763 .0037 


.9922 E+03 


.4300 


.0592 


.4512 


.0504 


.9399 E+03 


.0002 .00005 -.0001 .00005 .9618 E+03 

-.0872 .0015 -.08 .0013 .1493 E+06 


-.5320 


.0102 -.4823 


.0085 


.4529 E+07 


.1700 


.0043 


.1543 


.0045 


.6838 E+04 


-.2035 


.0036 -.1852 


.0031 


.7969 E+05 


.00055 .00024 -.0012 


.00072 


.1340 E+04 


-.2707 


.0116 -.2105 


.0091 


.6327 E+04 


-.00063 .00003 -.002 .00002 .1909 E+04 

.0323 .00045 .0329 .0004 .1515 E+06 


-.1043 


.0026 -.0916 


.0022 


.1400 E+06 


I- . 1462 


-.1534 


.0017 


.5039 E+05 


-.0044 


-.0037 


.0009 


.1780 E+04 


.0003 -.0532 


.9048 E+07 
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Table VII, Comparison of parameter estimates, their 

standard errors and time to reach convergence 
for programs MAX and MAXLIK 



MAXLIK 

Mi 

AX 


CPUT = 

= 342* 

CASE 1 

CASE 2 


CPUT 

= 105 

CPUT 

= 157 


0 

a(§)** 

0 

a(0) 

6 

o(0) 

Cx o 

.17017 

. 208E-03 

.17078 

. 249E-03 

.17070 

.212E-03 

Cx a 

.9464 

•616E-02 

.9245 

•703E-02 

.92511 

.555E-02 

Cx 6e 

.2789 

.404E-02 

.2754 

. 1 16E-01 

.2755 

.876E-02 

C- 

z o 

-.83946 

.896E-03 

-.84335 

. 106E-02 

-.84252 

. 103E-02 

C z 

z a 

-5.311 

. 230E-01 

-5.197 

.223E-01 

-5.194 

.215E-01 

c z 

z q 

-18.7 

.162+01 

-20.3 

. 189E+01 

-20.5 

. 177E+01 

Cz 6e 

-.618 

.264E-01 

-.566 

.350E-01 

-.570 

.324E-01 

c m 0 

|- .00125 1 

.828E-04 

-.001610 

.942E-04 

-.001555 

.937E-04 

C ™a 

-.5129 

. 102E-02 

-.5186 

.115E-02 

-.5183 

. 1 10E-02 

% 

-16.95 

•157E+00 

-19.06 

. 144E+00 

-18.83 

. 144E+00 

Cm 6e 

-1.3409 

. 576E-02 

-1.4150 

.447E-02 

-1.4075 

.461E-02 


* Central processing unit time in sec. 
** Cramer-Rao bound. 
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Table VIII. Parameter estimates and their confidence limits using 
Cramer-Rao boards and a random search technique 


Para- 

meter 

A 

e 

95% Confidence Intervals 

Cramer-Rao 

bound 

Random 

Search 

c *e 

-.77 

± .27 

.92 

-.89 

C Y6r 

.18 

± .27 

.87 

-.89 

C H 

-.228 

± .021 

.042 

-.10 

c® 

x p 

-.88 

± .13 

.30 

-.99 

C *r 

-4.20 

± .98 

2.1 

-5.5 

C *6a 

-.152 

± .020 

.036 

-.11 

c np 

.0826 

± .0040 

.011 

-.013 

C n 

n P 

-.078 

| ± .019 

.050 

-.10 

C n r 

-1.24 

± .19 

.49 

-.76 

Cn <Sr 

-.0860 

± .0064 

.020 

-.021 



